High on Bugs!

NewsCom: Unleashing Community Voices

Saptarshi Bhattacharya — Sun, 21 Jan 2024 18:36:14 GMT

What started as a simple project to learn GitHub Actions became a potentially powerful conduit for collaboration. NewsCom envisions a place where diverse voices converge to share insights, experiences, and stories. In this blog, we walk through the project's inception, development, and intricacies. This community-driven newsletter thrives on the collective wisdom of its contributors.

The Purpose and Goals

At the heart of this initiative lies a simple yet profound purpose: to cultivate a space where tech gets written by and for techies with minimal setup. This community newsletter is not just a platform for disseminating information; it is a testament to the power of collaborative storytelling, where the richness of shared experiences transcends boundaries. It aims to provide a detached content contribution system while still letting contributors retain a degree of ownership over what they write.

GitHub Authentication: A Pillar of Security and Engagement

To preserve content ownership and keep the environment secure, we've implemented user authentication through GitHub accounts. This streamlines the contribution process and adds an extra layer of trust and accountability to the community.

By leveraging GitHub's robust authentication system, we not only enhance the security of our platform but also integrate with a vast network of developers and enthusiasts. This approach simplifies the registration process (more like signing in, in our case) and taps into the pre-existing GitHub community, creating a familiar and welcoming environment for users.

Creating and Submitting Articles on the Webpage

With the foundation of GitHub authentication laid out in the previous section, we now turn to the heart of our community-driven newsletter project: the process of creating and submitting articles. Our dedicated webpage serves as a simple Markdown editor where contributors, armed with their GitHub identities, can weave their narratives and share their insights. Any GitHub user can contribute their first article in five steps:

Go to https://newscom.sbk2k1.tech/
Sign in with GitHub by clicking on "Write a Blog Yourself".
Authenticate GitHub App.
Click on "Write a Blog Yourself".
Write and Submit!

As contributors craft their articles, the integration with GitHub ensures that each iteration is tracked. Once an article takes shape and the contributor is satisfied, our platform handles the submission through a straightforward interface. This can also foster a better understanding of Git and GitHub amongst developers.

Pull Requests and Collaborative Editing

In the collaborative ecosystem of our community newsletter, the editorial process takes center stage as repository collaborators assume the role of editors. GitHub pull requests (PRs) become the conduit through which individual contributions undergo scrutiny, refinement, and ultimately, integration into the collective narrative.

Collaborators as Editors

Within our GitHub repository, a select group of individuals, our repository collaborators, take on the pivotal role of editors. Endowed with the responsibility of reviewing and curating content, these collaborators evaluate each pull request based on the project's editorial guidelines. Their expertise ensures that the newsletter maintains a high standard of quality, coherence, and relevance.

Automation with GitHub Actions

GitHub Actions, a powerful workflow automation tool, becomes the silent orchestrator behind the scenes, ensuring that our community newsletter project runs seamlessly. This chapter explores the role of GitHub Actions in automating critical processes, from triggering events based on article collection milestones to compiling and distributing the newsletter to our eager subscribers.

Triggering Actions: From Collection to Compilation

One of the primary functions of GitHub Actions in our project is the automatic triggering of workflows based on predefined conditions. As articles accumulate in the main branch (collecting), a GitHub Action is set to activate when a certain number is reached. This marks the commencement of the compilation process, ensuring that the newsletter evolves organically.
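The threshold check described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual workflow: the `articles` directory layout, the threshold of 5, and the function name `should_compile` are all assumptions.

```python
from pathlib import Path

ARTICLE_THRESHOLD = 5  # hypothetical milestone; the real number is not stated in this post


def should_compile(articles_dir: str, threshold: int = ARTICLE_THRESHOLD) -> bool:
    """Return True once enough Markdown articles have accumulated."""
    root = Path(articles_dir)
    if not root.is_dir():
        return False
    # Count only Markdown files directly inside the collection directory.
    article_count = sum(1 for path in root.glob("*.md") if path.is_file())
    return article_count >= threshold
```

A workflow step could run a check like this after every merge and start the compilation job only when it returns True.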

Email Notifications: Connecting with Subscribers

With the compiled newsletter in hand, GitHub Actions takes the next step by automating the distribution process. Subscribers, eagerly awaiting the latest edition, receive automated email notifications. This ensures timely and consistent delivery, enhancing the overall user experience and engagement.

Cleanup and Cloudinary Backup

In the spirit of meticulous housekeeping, GitHub Actions goes a step further by cleaning up the main branch post-compilation. This ensures a fresh slate for the next collection cycle. Simultaneously, a backup process sends the compiled files to Cloudinary, providing a secure archive for previous issues. This backup adds an extra layer of protection against data loss.

Note: Currently a user can write only 1 article per day.

Future Decisions

Currently, the project has quite a few shortcomings. The front end and UI are not the best. The editor and the overall feel are not very user-friendly. Moreover, there is no subscribe and publish feature yet, since it would need cloud-hosted services (at least the database). I'll push these features when there are at least 50 article submissions.

Conclusion

The project is centered around minimal setup and a simple workflow. I'll try to keep improving on that. Careful curation of articles is something I expect to focus on heavily as the project scales.

Technologies used

GitHub Actions: Used for automating workflows, such as triggering events based on article collection milestones, compiling Markdown articles, and distributing the newsletter.
Node.js: Backend proxy that handles publishing and related tasks.
React.js: The website
Octokit: A JavaScript toolkit for the GitHub API. It facilitates communication with the GitHub API, allowing seamless integration and interaction with GitHub features.
GitHub API and OAuth: The GitHub API is used for accessing and manipulating GitHub data, while OAuth is employed for secure and standardized user authentication using GitHub accounts.
GitHub Version Control: Inherent to the GitHub platform, version control is fundamental for tracking changes, managing branches, and ensuring the integrity of the project's codebase.

Website: https://newscom.sbk2k1.tech/
GitHub Repo: https://github.com/High-on-Bugs/HoB-Community-Newsletter

If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

How to deploy your Website to GitHub Pages using GitHub Actions

Saptarshi Bhattacharya — Fri, 19 Jan 2024 09:03:29 GMT

Step 1: Organize Files

Place your HTML file, CSS file, and assets/js in a dedicated folder. Let's call it Website.

The folder structure should look something like this.

    .github/workflows/deploy.yml     # we'll talk about this later
    /Website/                        # All your website stuff
        HTML
        CSS
        Js
        /Other Assets/
    Readme.md                        # optional - not necessary

Step 2: Understanding GitHub Actions

GitHub Actions is a powerful automation and continuous integration (CI) tool seamlessly integrated into the GitHub platform.

Utilizing declarative YAML configurations, GitHub Actions allows developers to define workflows: automated processes triggered by various events such as code pushes or pull requests.

These workflows consist of jobs and steps, where each step represents a task, such as building, testing, or deploying code.

Check out this video to understand it better:

https://www.youtube.com/watch?v=mFFXuXjVgkU

Step 3: Create GitHub Action Workflow

Create a new folder named .github/workflows in the root of your repository, and inside it, create a file named deploy.yml. This YAML file will define your GitHub Actions workflow.

    name: Deploy to GitHub Pages
    on:
      push:
        branches:
          - master # change to your main branch
    jobs:
      deploy:
        runs-on: ubuntu-latest
        permissions:
          contents: write
        concurrency:
          group: ${{ github.workflow }}-${{ github.ref }}
        steps:
          - name: Checkout Repository
            uses: actions/checkout@v2
            with:
              ref: master   # again change with your main branch
          - name: Init new repo in website folder and commit generated files
            run: |
              cd ./Website
              git init
              git add .
              git config --local user.email "bhattacharyasaptarshi2001@gmail.com"
              git config --local user.name "Saptarshi"
              git commit -m 'deploy'
          - name: Add safe.directory exception
            run: git config --global --add safe.directory /github/workspace/Website
          - name: Force push to destination branch
            uses: ad-m/github-push-action@master
            with:
              github_token: ${{ secrets.GITHUB_TOKEN }}
              branch: gh-pages
              force: true
              directory: ./Website

Let's understand each part and what it does.

Workflow Name and Trigger:

    name: Deploy to GitHub Pages
    on:
      push:
        branches:
          - master # Change to your main branch

name:: Defines the name of the GitHub Actions workflow. In this case, it's named "Deploy to GitHub Pages."
on:: Specifies the events that trigger the workflow. Here, the workflow runs on every push to the specified branches, in this case, the master branch.

Job Configuration:

    jobs:
      deploy:
        runs-on: ubuntu-latest
        permissions:
          contents: write
        concurrency:
          group: ${{ github.workflow }}-${{ github.ref }}

runs-on:: Specifies the GitHub-hosted runner environment for the job. Here, it's set to run on the latest version of Ubuntu.
permissions:: Grants write permissions to the contents of the repository for this job.
concurrency:: Helps manage concurrent workflow runs by grouping them based on the workflow name and branch. This can prevent race conditions when deploying.

Steps:

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
        with:
          ref: master   # Change with your main branch

actions/checkout@v2: Action that checks out the repository at the specified ref (branch). In this case, it's checking out the master branch.

      - name: Init new repo in the website folder and commit generated files
        run: |
          cd ./Website
          git init
          git add .
          git config --local user.email "bhattacharyasaptarshi2001@gmail.com"
          git config --local user.name "Saptarshi"
          git commit -m 'deploy'

run: Executes a series of shell commands.
cd ./Website: Changes the working directory to the ./Website folder.
git init: Initializes a new Git repository.
git add .: Adds all files in the current directory to the staging area.
git config: Sets local Git configurations for the user's email and name.
git commit: Commits the changes with the message 'deploy'.

      - name: Add safe.directory exception
        run: git config --global --add safe.directory /github/workspace/Website

git config --global --add safe.directory /github/workspace/Website: Marks the Website folder as a safe directory in the global Git configuration, so that Git does not refuse to operate on a repository owned by a different user.

      - name: Force push to the destination branch
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          branch: gh-pages
          force: true
          directory: ./Website

ad-m/github-push-action@master: Utilizes the GitHub Push Action to force push changes to a specified branch.
github_token: Uses the repository's GitHub token stored in secrets for authentication.
branch: gh-pages: Specifies the branch to which the changes will be force pushed.
force: true: Enables force pushing, which overwrites the existing history on the gh-pages branch.
directory: ./Website: Specifies the directory from which to push the changes.

Step 4: Create `gh-pages` Branch

Create a new branch named gh-pages. GitHub automatically detects this branch and deploys its content to GitHub Pages. You can create the branch for the repository from the GitHub website, or from the command line. You can also manually set it in the repository settings.

The commands are:

    git checkout -b gh-pages
    git push origin gh-pages

Go to https://github.com/sbk2k1//settings/pages

And you can manually configure it as well

Step 5: Set GitHub Token

In your repository settings, go to "Settings" > "Actions" > "General". Scroll down to the "Workflow permissions" section and select the Read and Write permissions options. Save it.

Step 6: Remove Custom Domain CNAME (Optional)

If you want to remove a custom domain from your GitHub Pages site (assuming you previously set it up and don't want it anymore), go to the URL below

https://github.com/USERNAME/USERNAME.github.io/blob/master/CNAME

And remove the CNAME inside the file.

Now, commit and push these changes, and your GitHub Actions workflow will automatically deploy your static page to GitHub Pages on the gh-pages branch!

Repository with the code and setup: https://github.com/High-on-Bugs/Deploy-Wesbite-using-Actions-and-Pages

GitHub Issue discussing expired domain problem: https://github.com/isaacs/github/issues/1213

If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Deploy your TypeScript Express App to Vercel (2024)

Saptarshi Bhattacharya — Thu, 18 Jan 2024 15:22:52 GMT

Disclaimer: This blog does not discuss Express or how to build server logic. It only focuses on deploying the app to Vercel as a Serverless Function.

Step 1: Export `app` instead of listening on a certain port.

Export the app in ES6 fashion rather than calling app.listen().

This

export default app;

Instead of

    app.listen(PORT, () => {
        console.log("Server listening on port", PORT);
    });

Step 2: Create an `api` folder for Vercel and set it up.

Create an api folder that has an index.ts as follows:

    import app from '../app';

    export default app;

This imports the app from your root directory (change the path if you have a different setup) and exports it for Vercel.

Step 3: Mention the API folder in `tsconfig.json`

Update tsconfig.json as follows to track the api folder

    {
      "compilerOptions": {
        "module": "commonjs",
        "esModuleInterop": true,
        "target": "es6",
        "rootDir": "./",
        "outDir": "build",
        "strict": true
      },
      "include": ["./api/*.ts"] // -> this is the line you need to update
    }

Step 4: Create the Public folder

Create an empty Public folder because Vercel looks for it during deployment. We need to keep it even though no static files are to be served.

Create a .gitkeep file inside it so that Git tracks the otherwise empty folder.

Step 5: Create `vercel.json` file

The vercel.json should look like this

    {
        "rewrites": [
            {
                "source": "/(.*)",
                "destination": "/api"
            }
        ]
    }

This rewrites every request received anywhere in the app to the /api folder. The /api folder uses /api/index.ts, which in turn uses app.ts, and so on down the chain.
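To see why that single rule catches everything, here is a small Python sketch that imitates the matching. It is an approximation under stated assumptions: Vercel evaluates `source` patterns with its own path-matching syntax, which is modelled here with Python's `re` module, and the `resolve` helper is a hypothetical name for illustration.

```python
import re

# Mirrors the single rule from the vercel.json in this post.
REWRITES = [{"source": "/(.*)", "destination": "/api"}]


def resolve(path: str) -> str:
    """Return the destination of the first rule whose source matches the whole path."""
    for rule in REWRITES:
        if re.fullmatch(rule["source"], path):
            return rule["destination"]
    return path  # no rule matched, so the path is left untouched


for request_path in ["/", "/users/42", "/health"]:
    print(request_path, "->", resolve(request_path))
```

Because `/(.*)` matches any path that starts with a slash, every incoming request ends up at the function in the api folder, and Express takes over routing from there.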

Step 6: Rewrite the Build Command in `package.json`

Vercel handles all the transpilation, so we need to override the build command in package.json. Create a new script called vercel-build, which prevents the TypeScript compiler from being invoked and instead acts as a dummy placeholder. The script should look like this.

"vercel-build": "echo hello",

Step 7: Deploy

The app is now deployable and you can do it from the Vercel console by connecting your GitHub account.

If you have Vercel CLI installed you can check it on your local machine using the command

vercel dev

and deploy using

vercel

My Code: https://github.com/High-on-Bugs/typescript-express-vercel-tutorial

Check this video for a video tutorial:

https://www.youtube.com/watch?v=B-T69_VP2Ls

If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Setting Up My Simple Home Server: A Practical Guide

Saptarshi Bhattacharya — Sun, 14 Jan 2024 10:24:26 GMT

Chapter 1: Formatting the Old PC

Introduction

Welcome to the kickoff of my home server project! Picture this: a forgotten PC, originally my dad's old workstation, sitting idle. It came with an 8th-gen Intel i3 processor, no flashy GPU, and a humble 8 GB of RAM, and I saw potential in turning it into a nifty home server.

The Appeal of Repurposing

Repurposing would let me use the home server to back up files, stream movies and media, deploy my projects, and learn some other cool stuff.

Backing Up Data

Importance of Data Backup

There were not a lot of files to be backed up. So I used Google Drive to back a few of them up. The rest would get purged.

Chapter 2: Clean Slate

The Decision to Start Fresh

I decided to wipe the slate clean by formatting the existing Windows 10 installation. The reasons behind this choice include eliminating unnecessary clutter, ensuring a fresh start, and optimizing the system for its new role as a home server.

As there was a single SSD, with multiple partitions, I combined them into one and went ahead with the Linux Installation.

This video should be enough to get the steps on how to reset your Windows 10/11 machine:

https://www.youtube.com/watch?v=5OVwQdfUztU

Chapter 3: Journey to Ubuntu Installation

For video guidance, you can use this video:

https://www.youtube.com/watch?v=oNEwEQ0uU1Y

Flashing the Drive

With the old PC prepped and ready, it was time to introduce it to its new companion - a shiny USB drive. I chose to use Balena Etcher for this job, a tool that makes flashing the drive a breeze. A few clicks, and we were set to roll.

Navigating the BIOS

Ah, the BIOS: the backstage pass to your PC's inner workings. Before diving into the installation, a pitstop in the BIOS was essential: adjusting settings, ensuring compatibility, and making sure the USB drive took center stage in the boot order.

Setting Boot Priority

The boot priority dance was next on the agenda. I wanted the system to look at the USB drive first, ensuring a smooth transition from the flashing process to the Ubuntu installation.

Installation Initiated

With the USB drive ready and the boot priority set, it was time for the main event - installing Ubuntu. The familiar installation wizard guided me through the process, prompting me for language preferences, time zone settings, and user details. A few clicks later, the installation was underway.

No Ethernet Cable, No Problem

Ah, the hiccup. No Ethernet cable on hand meant no updated packages during installation. But worry not; I opted for a minimal install to keep things straightforward. I would need to set up my internet later.

And there you have it - Chapter 3! The USB drive is flashed, the BIOS is in check, and Ubuntu is making its home on the old PC. Stay tuned for the next chapter, where we tackle connection challenges and navigate the intricacies of USB tethering for internet access during and after installation. 🌐

Chapter 4: USB Tethering to Mobile Phone

Overcoming Connection Challenges During Installation

As the Ubuntu installation progressed, a familiar hurdle emerged: the absence of an Ethernet connection. No Ethernet, no problem, but a solution was in order. This chapter delves into the challenges faced and the journey toward utilizing USB tethering to a mobile phone as the savior.

Introduction to USB Tethering

With no Ethernet cable in sight, the spotlight turned to USB tethering. This technology, often underutilized, allows a seamless connection between a computer and a mobile device, effectively transforming the mobile phone into a gateway to the digital realm.

Setting Up IP and Gateway via USB Tethering

The Failed Attempt with Manual Net Tools Installation

Initially, I attempted a manual installation of net tools from the official Ubuntu website. The process involved mounting the tools on a USB drive, but alas, it proved to be a cumbersome task and didn't yield the desired results. It was time to pivot.

Embracing USB Tethering

Enter USB tethering - a more straightforward and reliable solution. Connecting the mobile phone to the PC via USB cable initiated the tethering process. A quick check using ip link revealed the available network interfaces, among which the USB-tethered interface took center stage.

Configuring IP and Gateway Settings

The next step involved setting up the IP address and gateway for the USB-tethered interface. Choosing an address in the same subnet as the phone's gateway ensured a stable internet connection.

Let's go through the steps:

Check available network interfaces
```
 ip link
```
Recognize the USB interface and assign an IP address manually. Assuming IP as 192.168.42.10/24 and interface as enp0s20u1
```
 sudo ip addr add 192.168.42.10/24 dev enp0s20u1
 sudo ip link set enp0s20u1 up
```

Set the default gateway using the following command

 sudo ip route add default via 192.168.42.1 dev enp0s20u1

Reboot
```
 reboot
```
Try to ping an external IP to confirm if the connection works.
```
 ping 8.8.8.8
```

In case of errors, check your firewall rules.
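Before typing the addresses into `ip addr` and `ip route`, it can help to confirm that the chosen IP and gateway actually sit in the same subnet. The sketch below does that with Python's standard `ipaddress` module, using the example values from the steps above; the helper name `gateway_reachable` is made up for illustration.

```python
import ipaddress

# Example values taken from the steps above.
interface = ipaddress.ip_interface("192.168.42.10/24")
gateway = ipaddress.ip_address("192.168.42.1")


def gateway_reachable(iface: ipaddress.IPv4Interface, gw: ipaddress.IPv4Address) -> bool:
    """A default gateway must be inside the interface's subnet and differ from its own IP."""
    return gw in iface.network and gw != iface.ip


print(interface.network)                      # 192.168.42.0/24
print(gateway_reachable(interface, gateway))  # True
```

If the check returns False, the gateway is not directly reachable from the address assigned to the interface, and the default route will not work as intended.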

Achieving Internet Access

With the USB tethering setup, the old PC was now online. Internet access meant that the installation process could proceed without any hiccups.

Updating Linux and Adding Net Tools

The newfound internet connection was immediately put to use. The system underwent a thorough update, ensuring that the latest Linux packages were on board. Additionally, the previously elusive net tools were seamlessly added to the arsenal, simplifying future networking tasks.

Chapter 5: RTL8812AU Driver Woes

Device Name: TP-Link Archer T2U (RTL8812AU)

Dealing with the Absence of an Official Driver

The path to a fully functional home server encountered a significant hurdle when the built-in Wi-Fi adapter, the TP-Link RTL8812AU, found no official Linux support. Undeterred, I delved into the challenge, determined to find a workaround.

Discovering Community Wisdom on Mint Forum

A ray of hope emerged when I stumbled upon a community post on the Mint forum. Fellow enthusiasts faced similar RTL8812AU woes. The community post led me to a GitHub repository housing a solution to my driver predicament. The repository contained not only the necessary driver files but also somewhat detailed instructions in the README file.

Installing the Driver Using Readme Instructions

After cloning the repository, I followed the step-by-step instructions from the README file to install the drivers. Commands were entered, configurations were adjusted, and dependencies were resolved, all in alignment with the provided steps. With the driver installed, a quick network restart was in order. The moment of truth arrived as I eagerly awaited the appearance of the Wi-Fi interface, signaling that the RTL8812AU driver had successfully integrated with the system.

The Driver and the steps to install can be found here.

Setting Up Netplan to Configure Networking

I then configured the network settings using Netplan. This involved crafting a Netplan YAML file that specifies the DHCP settings, the nameservers, and the login credentials for the network. Applying the Netplan YAML file pushed the configuration to the system. The digital gears clicked into place as the network settings were adjusted, paving the way for a robust and stable connection.

With Netplan's configurations in place, the once elusive Wi-Fi connection now offered a gateway to the internet. The old PC was now fully equipped, ready to explore the digital landscape and fulfill its role as a reliable home server.

I used this video to get my wifi to work:

https://www.youtube.com/watch?v=Dacn58kgMXA

My netplan configuration yaml looked as follows:

    network:
      ethernets:
        enp3s0:
          optional: true
          dhcp4: true
        usb0:
          dhcp4: true
      version: 2
      wifis:
        :
          dhcp4: no
          addresses: [/24]
          routes:
           - to: default
             via:
          nameservers:
            addresses: [1.1.1.1, 1.0.0.1]
          access-points:
            :
              password:

Chapter 6: Unlocking Advanced Capabilities

As the home server project unfolded, the quest for enhanced functionalities led to the introduction of several powerful features, turning the old PC into a versatile hub for various applications.

SambaShare for NAS

Embracing the concept of Network-Attached Storage (NAS), the server now boasts SambaShare integration. This enables seamless file sharing across devices within the network. Whether it's documents, media files, or backups, the NAS capabilities provide a centralized repository accessible from any connected device.

Check this video out for help with installation:

https://www.youtube.com/watch?v=0-T7af_lRF8&t=945s

Plex for Media Streaming

The entertainment dimension received a significant upgrade with the integration of Plex. Now, the server doubles as a media streaming powerhouse. Plex not only organizes the media library but also enables streaming on-demand, turning the old PC into a personal media center.

Check this video out for help with installation:

https://www.youtube.com/watch?v=QEP5Tq78cHw

Docker, Minikube, and OpenSSH for Remote Development

The journey into the world of containerization began with the installation of Docker. This enables the deployment of applications in isolated containers, ensuring efficient resource utilization and easy management. Minikube, on the other hand, introduces the capabilities of Kubernetes at a smaller scale, providing a robust platform for container orchestration.

Enabling secure remote access for development purposes, OpenSSH was configured. This feature facilitates a secure shell connection, allowing developers to access and manage the server remotely. Coupled with Visual Studio Code's Remote SSH extension, the development workflow is further streamlined, providing a seamless and efficient coding environment.

Check out Hitesh Choudhary and other tutorials on YT to install these!

Chapter 7: Conclusion

Finally, I'm at a stage where I can shift my entire development workload onto a Linux system and avoid the incessant Windows-shaming I've been facing over the years. Having a server ready can also help me get a better grip on computer networks and Linux filesystems, as well as learn advanced DevOps topics.

If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Striking the Right Chord: Gaming and Beyond with Python-Powered Audio Magic

Saptarshi Bhattacharya — Thu, 05 Oct 2023 18:05:40 GMT

Okay! I agree the title sounds overly technical. Long story short, I made a program to play Counter-Strike using my guitar. Why write a blog about a program that lets you play Counter-Strike with a guitar? Well, truth be told, it doesn't have much practical use. So why bother? The reason is to spotlight the valuable building blocks within the project that could benefit you.

If you want to check the software out for yourself, visit the SOWS Website.

Here is an early usage video of the software.

https://www.youtube.com/watch?v=NNfp-58yXsA

So let us get into answering the questions of how and why.

Introduction

Capabilities

Driver-Swapping Wizardry: This thing can seamlessly switch between different audio drivers. Whether you're rocking out with your trusty headphones or getting all fancy with an audio interface, it's got your back.
Audio Detective Mode: You can stop actions and see the notes you are producing. So in theory you can replace your guitar tuner. (Not Recommended)
Musical Notes Meet Gaming Actions: Here's the kicker: it takes those signals, turns them into musical notes, and then maps those notes to in-game actions. Picture strumming a power chord to toss a grenade or hitting a sweet riff to reload your weapon. It's not that crazy. (Yet?)

Motivation Behind the Madness:

Now, you're probably wondering why in the world someone would come up with this concoction. Well, I wanted to merge two things I absolutely adore: jamming on the guitar and going all out in Counter-Strike. The wild and wacky creations of developers like Michael Reeves were a catalyst. He showed me that the craziest ideas can lead to the most fun and innovative projects.

I. The Foundations

Section 1.1: Python Object-Oriented Programming (OOP)

Importance of OOP in Software Development: Object-oriented programming (OOP) is the backbone of many modern software projects, and it plays a crucial role in making code more organized, modular, and maintainable. It's like building with Lego bricks; you create reusable components (objects) with their own data and behavior, making it easier to manage complexity as your project grows.

Project Structure Using Python Classes and Objects: In my project, we've leveraged OOP to structure the code effectively. Let's take a quick peek at how it's done:

    class Ui_MainWindow:
        # This class handles the user interface of my application.
        # It's structured using Qt Designer and PyQt5.
        pass

    class CustomEventFilter:
        # CustomEventFilter is a class designed to filter and process user input events.
        # It prevents keyboard mappings from interacting with the program itself.
        pass

    class ActionHandler:
        # ActionHandler is responsible for mapping audio signals to specific in-game actions.
        # It encapsulates the logic for translating guitar sounds into game commands.
        pass

    class AudioProcessingThread:
        # AudioProcessingThread is a separate thread that handles audio detection and processing.
        # This ensures that audio-related tasks don't block the main application thread.
        pass

Code Examples Illustrating Key OOP Concepts: Let's delve into some code snippets to see OOP concepts in action within my software:

    from PyQt5.QtCore import QObject, QThread

    # Example of Encapsulation
    class ActionHandler:
        def __init__(self):
            self.actions = {}  # This dictionary encapsulates our actions and their corresponding mappings.

    # Example of Inheritance
    class CustomEventFilter(QObject):
        def __init__(self):
            super().__init__()  # Qt event filters subclass QObject; we inherit its behavior to customize event handling.

    # Example of Polymorphism
    class AudioProcessingThread(QThread):
        def run(self):
            # Here, we override the 'run' method to provide our behavior while utilizing QThread's functionality.
            pass

Section 1.2: Desktop Application with PyQt5 and PySide

Choice of PyQt5 and PySide: I opted for PyQt5 and PySide to create the desktop application because of my past (yet limited) experience working with them for Mnemosyne.

Creating a Desktop App in Python: Building a desktop app in Python using PyQt5 and PySide involves several steps:

Designing the UI: We've used Qt Designer to design the user interface. It allows for a visual drag-and-drop interface design, which simplifies the process.
Creating the Main Window: We've created a Ui_MainWindow class to set up the main window of the application. This class is generated from the previously created UI design.
Event Handling: The CustomEventFilter class handles user input events, ensuring that the app is not affected by actions triggered by the audio.
Threading: To keep the app responsive, we use the AudioProcessingThread class to handle audio detection in a separate thread, preventing the main thread from being blocked.
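The threading idea from the last step can be sketched without Qt at all. The example below uses Python's standard `threading` and `queue` modules as a stand-in for `QThread`; the function names and the fake "frames" are assumptions for illustration, not code from the project.

```python
import queue
import threading


def detect_notes(frames, results: queue.Queue) -> None:
    """Worker: a stand-in for audio detection that runs off the main thread."""
    for frame in frames:
        results.put(f"processed {frame}")
    results.put(None)  # sentinel: tells the consumer there is nothing more to read


def run_in_background(frames) -> list:
    results: queue.Queue = queue.Queue()
    worker = threading.Thread(target=detect_notes, args=(frames, results), daemon=True)
    worker.start()

    collected = []
    while True:
        # This sketch simply blocks here; a GUI would receive a signal or poll instead.
        item = results.get()
        if item is None:
            break
        collected.append(item)
    worker.join()
    return collected


print(run_in_background(["frame-1", "frame-2"]))
```

The design point is the same as with QThread: the slow work happens in the worker, and results are handed back through a thread-safe channel instead of being computed on the thread that draws the UI.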

Showcasing the User Interface: My user interface, designed with PyQt5 and PySide, provides an intuitive way to interact with the application. Users can select audio drivers, visualize audio input, start a test mode to check out the note detected, and review the controls.

That's the foundation of the project structure and how I harnessed OOP principles to keep things organized and manageable while creating a desktop application using PyQt5 and PySide. These elements work in harmony to bring the magic of playing Counter-Strike with a guitar to life!

II. Audio Processing

Section 2.1: Audio Driver Integration

Interacting with Audio Drivers: In the heart of the software, there's a crucial component that enables the magic to happen - the interaction with audio drivers. This interaction allows us to tap into the audio data from your audio driver and make sense of it. Here's a glimpse of how it works:

Our software utilizes the PyAudio library to manage audio input. With PyAudio, we can establish connections with audio drivers, be it your headphone drivers or any audio interface you prefer. This gives us access to the raw audio data that flows through your system.

Challenges and Considerations: Working with audio interfaces presents its fair share of challenges. Ensuring compatibility across a wide range of drivers and hardware configurations can be tricky. We must consider issues like latency, device selection, and data format when dealing with these interfaces.

To tackle these challenges, I configured the software with the following settings:

# Audio settings
self.buffer_size = 1024
self.pyaudio_format = pyaudio.paFloat32
self.n_channels = 1
self.samplerate = 48000
self.lowest_pitch = 5
self.testing = False

These settings help us ensure that the audio input is processed efficiently and that the guitar notes are captured accurately.
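
To see what those numbers mean for latency, here is a quick back-of-the-envelope calculation. It only uses the values from the settings above, plus the fact that paFloat32 samples are 4 bytes each; the variable names are mine.

```python
buffer_size = 1024      # frames per buffer, as in the settings above
samplerate = 48000      # frames per second
n_channels = 1
bytes_per_sample = 4    # paFloat32 is a 32-bit float

# How long one buffer of audio lasts, and how often a new one arrives.
buffer_duration_ms = buffer_size / samplerate * 1000
buffers_per_second = samplerate / buffer_size
bytes_per_buffer = buffer_size * n_channels * bytes_per_sample

print(f"{buffer_duration_ms:.2f} ms per buffer")
print(f"{buffers_per_second:.1f} buffers per second")
print(f"{bytes_per_buffer} bytes per buffer")
```

So each buffer adds roughly 21 ms before a note can even be analysed. A smaller buffer lowers that delay but gives the pitch detector less signal to work with.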

Code Snippets: Here's an example of how we set up the audio stream with PyAudio in the software:

import pyaudio

# Initialize PyAudio
audio = pyaudio.PyAudio()

# Configure audio stream
self.buffer_size = 1024
self.pyaudio_format = pyaudio.paFloat32
self.n_channels = 1
self.samplerate = 48000

# Open an audio stream
self.stream = audio.open(format=self.pyaudio_format,
                         channels=self.n_channels,
                         rate=self.samplerate,
                         input=True,
                         frames_per_buffer=self.buffer_size)

This code establishes a connection with the audio driver, configures the audio stream, and prepares to receive audio data from your guitar.

Section 2.2: Fourier Transform for Frequency Analysis

Introduction to Fourier Transform: Imagine you have a complex sound, like the music from your guitar. This sound is made up of various individual musical notes. It is a mixture of different waveforms with varying frequencies rather than a single waveform with consistent identifiable parameters.

Here's how it works:

Sound Waves as Building Blocks: Sound is a wave, and complex sounds are made up of simpler waveforms. Each musical note you play on your guitar can be thought of as a unique waveform.
Breaking Down the Sound: The Fourier Transform takes the complex sound and breaks it down into these simple waveforms, sort of like taking a big jigsaw puzzle and separating it into its individual pieces.
Frequency Analysis: Each of these simple waveforms represents a specific musical note, and they have different frequencies.
Quantifying the Notes: The Fourier Transform quantifies how much of each of these simple waveforms is present in the complex sound. It tells us, "Hey, you've got a lot of this note, a little of that note," and so on.

So, in a nutshell, the Fourier Transform is like a magical tool that takes a complex sound, dissects it into its individual musical notes, and tells us how much of each note is in there. It's like breaking down a song into its musical ingredients, and it's incredibly useful in understanding and working with sounds in various fields, from music to engineering.
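
To make that concrete, here is a tiny, deliberately slow Discrete Fourier Transform written with nothing but the standard library. It is a teaching sketch, not what the software runs: the test signal (a 5 Hz wave plus a quieter 12 Hz wave) and the function name are made up for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the amplitude of each frequency bin below the Nyquist limit."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2):
        # Correlate the signal with a complex sinusoid of frequency bin k.
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        # Factor 2/n recovers the sine amplitude (valid for bins other than 0).
        amplitudes.append(2 * abs(total) / n)
    return amplitudes


sample_rate = 64  # samples per second, chosen so each bin is exactly 1 Hz
samples = [
    1.0 * math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]

amplitudes = dft_magnitudes(samples)
strongest = sorted(
    range(len(amplitudes)), key=lambda k: amplitudes[k], reverse=True
)[:2]
print(strongest)  # the two loudest frequency bins, in Hz
```

The two loudest bins come out at 5 Hz and 12 Hz, with amplitudes of about 1.0 and 0.5: "a lot of this note, a little of that note".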

You can check out this amazing video from 3B1B to get it:

https://www.youtube.com/watch?v=spUNpyF58BY&t=850s

Role of Fourier Transform: The Fourier Transform is a go-to technique for dissecting audio signals. It breaks down complex sound waves into their individual frequency components. For us, this means identifying the notes your guitar is playing by examining the frequency of the sound.

Fourier Transform Implementation: Here's a high-level overview of how we run the frequency analysis in the software. In practice, the heavy lifting is handed to aubio's pitch detector:

import aubio

# Initialize the pitch detection object (aubio's "yin" method)
pitch_detector = aubio.pitch("yin", self.buffer_size, self.buffer_size, self.samplerate)

# Run pitch detection on one buffer of float32 audio data
pitches = pitch_detector(self.audio_data)
confidence = pitch_detector.get_confidence()

# Extract the pitch (frequency) information
pitch_frequency = pitches[0]

In this code snippet, we use the Aubio library to create a pitch detection object. We then feed it the audio data, and it returns the pitch or frequency information. This frequency data is what we use to map guitar notes to in-game actions.
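
Once you have a frequency, turning it into a note name is a small piece of arithmetic. The helper below is a sketch of one common way to do it, assuming standard tuning (A4 = 440 Hz) and twelve-tone equal temperament; frequency_to_note is a hypothetical name, not a function from the software or from aubio.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(frequency_hz):
    """Map a frequency to the nearest equal-tempered note name (A4 = 440 Hz)."""
    if frequency_hz <= 0:
        return None  # treat zero or negative readings as "no pitch detected"
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12).
    midi_number = round(69 + 12 * math.log2(frequency_hz / 440.0))
    return NOTE_NAMES[midi_number % 12]


print(frequency_to_note(440.0))   # A
print(frequency_to_note(82.41))   # E (the low E string on a guitar)
print(frequency_to_note(261.63))  # C
```

The octave is thrown away on purpose here, since the mappings in this post only care about the note name.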

So, there you have it! We've unveiled how the software handles audio drivers, grapples with audio interfaces, and employs the Fourier Transform to turn your guitar sounds into a symphony of gaming actions. With these elements in play, you're one step closer to rocking out in Counter-Strike like never before!

III. Mapping Audio Frequencies to Actions

Mapping Audio Frequencies to Actions: Now, let's delve into the exciting part: mapping the frequencies generated by your guitar to specific keyboard and mouse actions within the software. This is where the magic happens! We've got a dictionary of musical notes, and each note corresponds to a unique action, such as moving, shooting, or jumping.

Examples of Frequency-to-Action Mapping: Here's a sneak peek at how some musical notes translate into actions:

When you play the note "C," it's like moving forward in the game.
If you hit "D," it's equivalent to moving backward.
"G#" on your guitar will make your character jump in the game.
"A#" triggers shooting, while "B" initiates a reload.

These mappings allow you to control your game character by playing your guitar. It's like turning your guitar into a gaming controller!
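
A dictionary is a natural fit for this kind of lookup. The sketch below uses the note names listed above; the action names and the action_for helper are placeholders I made up for illustration, not the identifiers used in the software.

```python
# Note name -> action name. The notes come from the examples above;
# the action names are hypothetical placeholders.
NOTE_ACTIONS = {
    "C": "move_forward",
    "D": "move_backward",
    "G#": "jump",
    "A#": "shoot",
    "B": "reload",
}


def action_for(note):
    """Look up the action for a detected note, or None if the note is unmapped."""
    return NOTE_ACTIONS.get(note)


for played in ["C", "G#", "F"]:
    print(played, "->", action_for(played))
```

Using .get means an unmapped note (like "F" here) is simply ignored instead of raising a KeyError mid-game.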

Toggles and Logic: But hold on, there's more! The software incorporates toggles and logic to make the gameplay experience smoother. For instance, if you play the "A" note, it toggles crouching on or off. So, the first "A" press crouches, and the next one stands your character back up. This smart logic ensures that you have control over these actions without any fuss. Moreover, we control 2D movement using a List of Movements across X and Y dimensions. Let's break down the logic and usage of lists in these functions:

# function to move forward or stop moving backwards
def moveForward(self):
    if self.navigation_mapping[0] is None:
        self.navigation_mapping[0] = 1
        keyboard.press('w')
    elif self.navigation_mapping[0] == -1:
        self.navigation_mapping[0] = None
        keyboard.release('s')

In this code, we have a function called moveForward that is responsible for moving your character forward in the game. Here's how it works:

self.navigation_mapping is a list used to keep track of your character's movement in two directions: forward/backward (index 0) and left/right (index 1).
When you call moveForward, the function first checks whether self.navigation_mapping[0] is None, meaning the character is not currently moving forward or backward. If you're already moving backward (indicated by -1), this check fails, so you won't start moving forward immediately.
If self.navigation_mapping[0] is indeed None, it sets it to 1 to indicate that your character is now moving forward. Additionally, it simulates a key press of the 'w' key using keyboard.press('w').
However, if self.navigation_mapping[0] is -1, it means your character is currently moving backward. In this case, the function sets self.navigation_mapping[0] back to None to stop moving backward and releases the 's' key to stop moving in that direction.

This logic ensures that your character can smoothly switch between moving forward and stopping moving backward, allowing for responsive and intuitive control.

Similar logic applies to the moveBackwards and moveLeft functions, where the list self.navigation_mapping is used to keep track of the character's movement in different directions, and key presses and releases are simulated accordingly to provide seamless control over your character's movement.
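
Here is a self-contained sketch of how the mirrored function and the crouch toggle could look. It assumes moveBackwards is the exact mirror of moveForward; FakeKeyboard, the Movement class, toggleCrouch, and the "ctrl" crouch key are hypothetical stand-ins so the example runs without sending real key presses.

```python
class FakeKeyboard:
    """Hypothetical stand-in that records key events instead of sending them."""

    def __init__(self):
        self.events = []

    def press(self, key):
        self.events.append(("press", key))

    def release(self, key):
        self.events.append(("release", key))


class Movement:
    def __init__(self, keyboard):
        self.keyboard = keyboard
        self.navigation_mapping = [None, None]  # [forward/backward, left/right]
        self.crouching = False

    def moveForward(self):
        if self.navigation_mapping[0] is None:
            self.navigation_mapping[0] = 1
            self.keyboard.press("w")
        elif self.navigation_mapping[0] == -1:
            self.navigation_mapping[0] = None
            self.keyboard.release("s")

    def moveBackwards(self):
        # Mirror image of moveForward: start moving back, or stop moving forward.
        if self.navigation_mapping[0] is None:
            self.navigation_mapping[0] = -1
            self.keyboard.press("s")
        elif self.navigation_mapping[0] == 1:
            self.navigation_mapping[0] = None
            self.keyboard.release("w")

    def toggleCrouch(self):
        # First call crouches, the next call stands back up.
        self.crouching = not self.crouching
        if self.crouching:
            self.keyboard.press("ctrl")  # assumed crouch key
        else:
            self.keyboard.release("ctrl")


keyboard = FakeKeyboard()
player = Movement(keyboard)
player.moveForward()    # starts walking forward
player.moveBackwards()  # stops walking forward
player.moveBackwards()  # starts walking backward
print(keyboard.events)
print(player.navigation_mapping)
```

Notice that it takes two "backward" notes to reverse direction: the first cancels the forward movement, the second starts the backward one.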

IV. Practical Applications

Section 4.1: Practical Applications

Nothing, really! The goal is to help you understand the different parts used in making the software and to spark ideas of your own.

Section 4.2: Creative Possibilities

Brainstorm Your Own Ideas: I encourage you to think outside the box. This software is a canvas for your creativity. Pick out individual parts and apply them to your own ideas.

V. Conclusion and Further Exploration

Summarizing the Journey: In this blog post, we've embarked on a journey into the world of using your guitar as a gaming controller. We've explored the technical aspects, mapped out actions, and even dabbled in the potential beyond just gaming with your guitar.

Unleash Your Creativity: The software is a starting point for your own tech adventures. We invite you to explore each element in more detail. Dive into Python Object-Oriented Programming, discover the wonders of audio processing, and experiment with game development or other fields. Here are some resources to get you started:

Link to Python Documentation: Learn more about Python, the language behind the software.
Link to PyQt Documentation: Learn about the framework used to make GUIs.
I'm sure you can look around for other stuff yourself. :)

Call to Action

Now, it's your turn! Download the software, give it a whirl, and share your experiences with us. We'd love to hear your feedback and suggestions for improvement. Join us in this journey of creativity and innovation, where music meets technology in exciting ways. Who knows what you'll come up with next?

I can also make it an Open-Source Project. Some features I would love to add:

Customizable Actions.
A Desktop overlay while playing games that shows active buttons.
Suggest your own.

My Socials - here!

If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Introduction to MongoDB - Part 1

Saptarshi Bhattacharya — Fri, 17 Mar 2023 20:28:24 GMT

MongoDB is a popular NoSQL document database that stores data in flexible, JSON-like documents. Unlike traditional SQL databases, MongoDB does not require a predefined schema, making it easier to store and query data of varying types and structures.

One of the main reasons why MongoDB is a popular choice for backend development is its scalability. MongoDB is designed to scale horizontally by adding more servers to handle increased traffic and data volume, making it an ideal choice for large-scale, high-traffic applications. Additionally, MongoDB's flexible data model makes it easier to adapt to changing business needs, reducing development time and effort.

Another key advantage of MongoDB is its ease of use. With MongoDB, developers can write queries using a simple syntax similar to JSON, which is easier to learn and understand than SQL. Additionally, MongoDB's query language supports a wide range of operations and allows for complex queries, making it well-suited for data analytics and reporting.

Finally, MongoDB is highly versatile and can be used in a wide range of applications, including web and mobile applications, content management systems, and Internet of Things (IoT) devices. Its compatibility with popular programming languages and frameworks, including Node.js and the MERN stack, makes it a popular choice for modern web application development.

How is MongoDB different?

MongoDB is a popular NoSQL document-oriented database that stores data in a flexible, JSON-like format called BSON. Unlike traditional relational databases, MongoDB does not require a predefined schema or structure for data storage, making it easier to store and query data of varying types and structures.

Some of the key differences between MongoDB and relational databases/SQL databases include:

Data model: MongoDB uses a document-based data model, whereas SQL databases use a table-based model. In MongoDB, documents are stored in collections, which can contain different types of data, while SQL databases require data to be structured in tables with predefined columns.
Schema design: MongoDB does not require a predefined schema for data storage, whereas SQL databases require a schema to be defined in advance. This means that MongoDB can be more flexible when it comes to schema design, as changes can be made on-the-fly without affecting the overall database structure.
Query language: MongoDB uses a simple, JSON-like query language, whereas SQL databases use SQL (Structured Query Language), which is more complex and requires knowledge of database schemas and table structures.
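
The document model and the JSON-like query style are easy to picture with plain Python dictionaries. This is only a toy imitation of the idea, with no MongoDB driver involved: the users list and the find helper are made up for illustration, and real queries support far more than exact matches.

```python
# Three "documents" in the same collection; note they don't share all fields.
users = [
    {"_id": 1, "name": "Asha", "languages": ["Python", "Go"]},
    {"_id": 2, "name": "Ravi", "age": 31},
    {"_id": 3, "name": "Mei", "age": 27, "languages": ["JavaScript"]},
]


def find(collection, query):
    """Return documents whose fields equal every key/value pair in the query."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]


print(find(users, {"age": 31}))
print(find(users, {"name": "Mei", "age": 27}))
```

In a SQL table, the missing age for Asha would have to be a NULL in a predefined column; here the field simply isn't there.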

Advantages of MongoDB:

Scalability: MongoDB is highly scalable and can easily handle large amounts of data and high levels of traffic by distributing data across multiple servers.
Flexibility: MongoDB's flexible data model allows for easier schema design and more efficient storage and querying of complex data structures.
Performance: MongoDB's indexing and sharding capabilities enable faster query times and efficient data retrieval.
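Open source roots: MongoDB's source code is publicly available (the Community Server has been licensed under the Server Side Public License since 2018), and it has a large community of developers, making it easy to find support and resources.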

Disadvantages of MongoDB:

Complexity: While MongoDB's flexible data model is an advantage, it can also be a disadvantage as it can be more complex to manage and query than a structured, relational database.
Memory usage: MongoDB can require more memory than a comparable SQL setup, as it relies heavily on in-memory caching for performance.
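Transaction caveats: MongoDB has supported multi-document ACID transactions since version 4.0 (and across sharded clusters since 4.2), but they add overhead, and data models that lean on them heavily can be harder to keep consistent and fast than in a relational database.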

Terminologies

Some of the terminologies used in MongoDB are:

Database: A MongoDB database is a container for collections of documents. Each database can have multiple collections, and collections can have multiple documents.
Collection: A MongoDB collection is a group of documents that share a similar structure. Collections are analogous to tables in a relational database, but they do not have a predefined schema or structure.
Document: In MongoDB, a document is a record stored in a collection. A document is represented as a JSON-style object and can contain fields of various data types.
Schema: A MongoDB schema defines the structure of documents within a collection, including the fields and data types for each field. Unlike a traditional database schema, MongoDB schemas are flexible and can be changed dynamically.
Model: In the context of Node.js and MongoDB (typically through an ODM such as Mongoose), a model is a JavaScript object that represents a collection in MongoDB. A model provides an interface for querying and modifying documents in a collection.
Index: A MongoDB index is a data structure that improves the speed of data retrieval operations by allowing queries to quickly locate documents that match certain criteria.
Aggregation: MongoDB aggregation refers to the process of combining multiple documents from one or more collections to perform a set of data processing operations, such as filtering, sorting, and grouping.

These are some of the most commonly used terminologies in MongoDB, but there are many others as well.

How to design a Schema?

A schema is a blueprint that defines the structure of documents within a collection. Unlike traditional databases, MongoDB allows for flexible schemas, which means that documents within a collection can have different fields and data types.

Fields in MongoDB refer to the individual pieces of data stored within a document. A field consists of a name and a value. The name of a field is a string that identifies the data stored within it, and the value can be of any data type supported by MongoDB.

Types in MongoDB refer to the various data types that can be used to represent data within a field. Some common data types in MongoDB include:

- String: This is the most commonly used datatype to store data. Strings in MongoDB must be valid UTF-8.
- Integer: This type is used to store a numerical value. Integer can be 32-bit or 64-bit depending upon your server.
- Boolean: This type is used to store a boolean (true/false) value.
- Double: This type is used to store floating point values.
- Min/Max keys: This type is used to compare a value against the lowest and highest BSON elements.
- Arrays: This type is used to store arrays, lists, or multiple values in one key.
- Timestamp: This can be handy for recording when a document has been modified or added.
- Object: This datatype is used for embedded documents.
- Null: This type is used to store a Null value.
- Symbol: This datatype is used identically to a string; however, it's generally reserved for languages that use a specific symbol type.
- Date: This datatype is used to store the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing a day, month, and year into it.
- Object ID: This datatype is used to store the document's ID.
- Binary data: This datatype is used to store binary data.
- Code: This type is used to store JavaScript code in the document.
- Regular expression: This datatype is used to store regular expressions.

When you work with MongoDB through Mongoose (a popular ODM for Node.js), schemas also support "pre" and "post" middleware functions, often called hooks, that allow developers to add custom functionality to document and query operations. "Pre" functions are executed before a specific operation, such as save or update, and can be used to perform validations or transformations on the data being modified. "Post" functions are executed after a specific operation, such as save or find, and can be used to perform additional processing on the result data.

These hooks are ordinary JavaScript functions registered on the schema. They can be used to perform a wide variety of operations, such as data validation, transformation, logging, and more.

Querying and Manipulation

MongoDB offers powerful querying and aggregation capabilities, as well as support for indexing, to enable efficient and flexible data retrieval and manipulation.

Querying: MongoDB supports a wide range of query operators and methods for retrieving documents from collections based on specific criteria. Queries can be based on a single field, a combination of fields, or even nested fields within a document. MongoDB also supports a flexible query language that allows for complex logical expressions and regular expressions.

Aggregation: MongoDB's aggregation pipeline provides a flexible and powerful way to perform complex data transformations and analysis on collections. The pipeline consists of stages that can be used to filter, transform, group, and aggregate data in a variety of ways. Each stage in the pipeline takes input from the previous stage and passes the output to the next stage. This playlist by Bogdan Stashchuk has everything you need to get started working with MongoDB aggregation.

Indexing: MongoDB supports a wide range of indexing options to improve query performance and data retrieval times. Indexes can be created on single fields or combinations of fields within a collection. MongoDB supports several types of indexes, including unique indexes, text indexes, and geospatial indexes. Indexes can significantly improve query performance, especially for large collections, and can also support efficient sorting and range queries.
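
The "each stage feeds the next" idea behind the aggregation pipeline can be imitated in a few lines of plain Python. This is a conceptual sketch only: the three helper functions loosely mirror the $match, $group, and $sort stages, and the orders data and function names are invented for the example.

```python
from collections import defaultdict

orders = [
    {"customer": "asha", "status": "paid", "total": 40},
    {"customer": "ravi", "status": "paid", "total": 15},
    {"customer": "asha", "status": "refunded", "total": 10},
    {"customer": "ravi", "status": "paid", "total": 25},
    {"customer": "asha", "status": "paid", "total": 5},
]


def match(docs, field, value):
    """Filter stage: keep documents where field equals value."""
    return [doc for doc in docs if doc[field] == value]


def group_sum(docs, key, value_field):
    """Group stage: sum value_field per distinct key."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key]] += doc[value_field]
    return [{"_id": k, "total": v} for k, v in totals.items()]


def sort_desc(docs, field):
    """Sort stage: order documents by field, largest first."""
    return sorted(docs, key=lambda doc: doc[field], reverse=True)


# Each stage takes the previous stage's output as its input.
paid = match(orders, "status", "paid")
per_customer = group_sum(paid, "customer", "total")
result = sort_desc(per_customer, "total")
print(result)
```

In MongoDB itself the same pipeline would be expressed as a list of stage documents and executed by the server, which is what makes it efficient on large collections.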

Scaling Techniques

Scaling MongoDB to handle larger datasets requires one or a combination of the following techniques:

Sharding: Sharding is a technique for horizontally scaling MongoDB across multiple servers. With sharding, data is distributed across multiple shards, which are groups of servers that each contain a subset of the data. Sharding can help to improve performance and handle larger datasets by allowing MongoDB to distribute the workload across multiple servers.
Replication: Replication is a technique for ensuring high availability and fault tolerance in MongoDB. With replication, multiple copies of the data are stored across multiple servers. Changes made to the data on one server are automatically replicated to the other servers in the replica set. This can help to improve performance and ensure that data is always available, even in the event of a server failure.
Indexing: Indexing is a technique for improving query performance by creating indexes on fields within a collection. Indexes allow MongoDB to retrieve and sort data efficiently, which can help to improve performance and reduce query times.
Compression: MongoDB supports several compression techniques that can be used to reduce the size of the data stored in the database. Compression can help to reduce storage requirements and improve performance, especially for larger datasets.
Caching: Caching is a technique for improving performance by storing frequently accessed data in memory. MongoDB supports several caching mechanisms, including the WiredTiger cache, which can help to improve performance and reduce query times.
Load balancing: Load balancing is a technique for distributing incoming traffic across multiple servers to improve performance and reduce the risk of overloading any individual server. Load balancing can help to improve performance and handle larger datasets by distributing the workload across multiple servers.
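
To get a feel for how sharding decides where a document lives, here is a minimal sketch of hash-based routing. It illustrates the general idea behind hashed shard keys rather than MongoDB's actual implementation; shard_for, the SHA-256 choice, and the three-shard setup are all assumptions made for the example.

```python
import hashlib


def shard_for(shard_key, shard_count):
    """Pick a shard by hashing the key, so the same key always lands on the same shard."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Distribute twelve hypothetical user IDs across three shards.
shards = {index: [] for index in range(3)}
for user_id in range(1, 13):
    shards[shard_for(user_id, 3)].append(user_id)

for index, members in shards.items():
    print(f"shard {index}: {members}")
```

A simple modulo scheme like this reshuffles most keys when the shard count changes, which is one reason real systems use more elaborate chunk-based placement.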

I hope this was an informative blog on MongoDB. There will be a second part that will build a much more practical understanding of MongoDB and its capabilities. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Backend Development - Part 1

Saptarshi Bhattacharya — Fri, 17 Mar 2023 13:24:11 GMT

This series is the one I've been preparing for till now. The REST series and the Web Basics series all build up to this series. So, this is the blog that will kick off the Backend Development series. Let's first see what we are working with and what we'll be building.

The Tech Stack

MongoDB is a popular NoSQL database that provides scalability and flexibility for data management. Node.js is a powerful runtime environment for executing JavaScript code on the server side, while Express is a minimalist web framework for building APIs and web applications.

Together, these technologies provide a robust, scalable, and efficient solution for building a backend for a blog. In this guide, we will explore the fundamentals of using MongoDB, Node.js, and Express to build a backend for a blog, covering everything from setting up the environment to creating RESTful APIs and integrating them with a front end. So, let's get started!

What are we building?

We'll be building a user login system, which is a fundamental feature in most web applications, allowing users to securely access their personal information and data. However, building a user login system can present some challenges, such as user authentication, security, and session management.

One of the first challenges in building a user login system is implementing a secure authentication process. This can be accomplished by hashing passwords with a salted, deliberately slow algorithm such as "bcrypt", storing only the hash, and comparing it against the hash of what the user types at login.
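
The hash-then-compare flow looks roughly like this. To keep the sketch dependency-free it uses PBKDF2 from Python's standard library instead of bcrypt, so treat it as an illustration of the pattern, not of the bcrypt API; the function names and the iteration count are my own choices.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Re-hash the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


# At signup: store the salt and digest, never the password itself.
salt, stored = hash_password("correct horse battery staple")

# At login: hash the attempt with the stored salt and compare.
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The random salt means two users with the same password still end up with different stored hashes.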

Another challenge is managing user sessions to maintain the user's authentication state across multiple requests. This can be achieved through the use of cookies or JSON Web Tokens (JWTs), which store session data on the client side and allow for secure authentication and authorization.

Technologies commonly used in building a user login system include Node.js, Express, and MongoDB for backend development, and frameworks for the front-end side of things.

For authentication, technologies like JSON Web Tokens can be used. Sessions can be managed with the help of packages like express-session or with JWTs.
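
If you're curious what sits inside a JWT, here is a stripped-down sketch of the HS256 signing idea using only the standard library. It is for understanding, not for production: it skips expiry checks and header validation that real JWT libraries handle, and sign_token, verify_token, and the secret are hypothetical names for this example.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_token(payload, secret):
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token, secret):
    """Return the payload if the signature matches, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    signing_input = f"{header_b64}.{payload_b64}"
    expected = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    if not hmac.compare_digest(
        _b64url(expected).encode("utf-8"), signature_b64.encode("utf-8")
    ):
        return None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


secret = b"demo-secret-change-me"
token = sign_token({"sub": "user-1", "role": "author"}, secret)
print(verify_token(token, secret))
print(verify_token(token, b"some-other-secret"))
```

The payload is only encoded, not encrypted, so anyone holding the token can read it. The signature just proves the server issued it and nobody changed it.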

Pre-requisites

Some of the prerequisites for the series are:

JavaScript: JavaScript is the primary programming language used in building web applications with Node.js and Express. A basic understanding of JavaScript fundamentals, including data types, functions, and control flow, is necessary.
Node.js: Knowledge of Node.js is quite important. Check out my blog on Node.js for a quick refresher.
RESTful APIs: Building a blog requires creating RESTful APIs to handle requests and responses. Understanding RESTful API design principles and how to create endpoints for a web application is necessary. Check out my REST series.
HTML, CSS, and front-end frameworks: Building a blog also requires knowledge of HTML and CSS for creating a user interface. Knowledge of front-end frameworks such as React is helpful for building dynamic and responsive user interfaces.
Basics of how the Web Works: Check out my series on Web Basics as well.

We'll figure out and explain other requirements as we go along.

That's it! We'll start with installation and setup in the next blog. This was just a small blog to update on the Backend Dev blog series, that I've been preparing for a long time. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Introduction to Machine Learning

Saptarshi Bhattacharya — Thu, 16 Mar 2023 13:38:28 GMT

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves teaching machines to learn from data without being explicitly programmed. The machine learning algorithms automatically learn patterns and insights from the data and use that learning to make predictions or decisions on new and unseen data. It is a data-driven approach that enables computers to improve their performance on a specific task as they gain more experience, without human intervention.

Machine Learning is needed because it enables computers to learn and improve from experience and data, which can lead to more accurate and efficient predictions, decision-making, and automation of tasks. For example, machine learning is used in image recognition, speech recognition, natural language processing, fraud detection, recommendation systems, and many other applications that rely on pattern recognition and prediction.

How is it different from AI, Deep Learning, and Data Science?

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves the use of algorithms and statistical models to enable machines to learn from data without being explicitly programmed. ML algorithms can be classified into three main types: supervised learning, unsupervised learning, and reinforcement learning.

Artificial Intelligence (AI) is a broader field that aims to create machines that can perform tasks requiring human-like intelligence, including problem-solving, decision-making, and natural language understanding. AI is composed of multiple subfields, including ML, natural language processing, robotics, computer vision, and more.

Deep Learning (DL) is a subset of ML that uses artificial neural networks with multiple layers to learn and extract high-level representations of data. DL has been used to achieve state-of-the-art results in tasks such as image recognition, natural language processing, and speech recognition.

Data Science (DS) is an interdisciplinary field that involves collecting, analyzing, and interpreting data using statistical and computational methods. DS combines skills and techniques from various fields, including statistics, mathematics, computer science, and domain-specific knowledge, to derive insights and make decisions from data.

In summary, AI is the broadest field that encompasses ML and DL, which are specific subfields of AI. Data Science is a separate field that uses statistical and computational techniques to extract insights and knowledge from data. While they are all related, each field has its own unique focus and set of tools and techniques.

Types of Machine Learning

There are three main types of Machine Learning: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

Supervised Learning: This type of machine learning involves training the model using a labeled dataset, which means the input data is already labeled with the desired output. The model learns to predict the output for new and unseen data based on the patterns and relationships it has learned from the labeled data. Some examples of supervised learning include:

Image Classification: Identifying whether an image contains a cat or a dog
Spam Detection: Classifying emails as spam or non-spam
Sentiment Analysis: Predicting whether a review is positive or negative
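
As a tiny taste of supervised learning, here is a 1-nearest-neighbor classifier in plain Python. It is about the simplest model there is: to label a new point, copy the label of the closest training example. The spam features and numbers below are invented purely for illustration.

```python
def nearest_neighbor(train, point):
    """Predict the label of the closest labeled example (1-nearest-neighbor)."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _features, label = min(
        train, key=lambda example: squared_distance(example[0], point)
    )
    return label


# Toy labeled data: (exclamation marks, links) -> label. Values are made up.
training_data = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((5, 3), "spam"),
    ((7, 4), "spam"),
]

print(nearest_neighbor(training_data, (6, 3)))  # spam
print(nearest_neighbor(training_data, (0, 1)))  # not spam
```

The "learning" here is just remembering the labeled examples, but the shape is the same as in bigger models: labeled data in, predictions for unseen data out.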

Unsupervised Learning: This type of machine learning involves training the model on an unlabeled dataset, which means the input data is not labeled with the desired output. The model learns to identify patterns and relationships in the data without any guidance, and it groups similar data points together based on their similarities. Some examples of unsupervised learning include:

Clustering: Grouping customers based on their purchase history or behavior
Anomaly Detection: Identifying unusual patterns or outliers in a dataset
Dimensionality Reduction: Reducing the number of features in a dataset while retaining the most important information
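
Clustering can be sketched just as briefly. Below is a bare-bones one-dimensional k-means: no labels are given, and the algorithm groups the numbers on its own. The monthly spend figures and starting centers are made up for the example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (plain k-means)."""
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


monthly_spend = [12, 15, 14, 90, 95, 101]
centers, clusters = kmeans_1d(monthly_spend, centers=[0, 50])
print(clusters)  # [[12, 15, 14], [90, 95, 101]]
print(centers)
```

Nobody told the algorithm there were "low spenders" and "high spenders"; the two groups fall out of the data.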

Reinforcement Learning: This type of machine learning involves training the model to make decisions based on trial and error. The model learns by interacting with its environment and receiving feedback in the form of rewards or penalties based on its actions. The goal of the model is to maximize the rewards it receives by learning from its mistakes. Some examples of reinforcement learning include:

Game Playing: Learning to play chess, Go, or other games by playing against itself or other opponents
Robotics: Learning to perform tasks such as walking, grasping objects, or navigating a maze
Recommendation Systems: Learning to recommend products or content based on user feedback and preferences.
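
The trial-and-error loop of reinforcement learning can be shown with a classic toy problem, the multi-armed bandit. The sketch below uses an epsilon-greedy agent: most of the time it picks the option it currently believes is best, and occasionally it explores a random one. The payouts, step count, and function name are assumptions chosen to keep the example small, and rewards are deterministic to keep it easy to follow.

```python
import random


def run_bandit(payouts, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known arm, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payouts)  # the agent's current belief per arm
    pulls = [0] * len(payouts)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payouts))  # explore: try a random arm
        else:
            arm = max(range(len(payouts)), key=lambda i: estimates[i])  # exploit
        reward = payouts[arm]  # deterministic toy reward from the environment
        pulls[arm] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


estimates, pulls = run_bandit(payouts=[1.0, 5.0, 10.0])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(best_arm, estimates, pulls)
```

The agent is never told which arm pays best. It finds out by trying, getting rewarded, and updating its estimates.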

Examples

There are many real-life applications of Machine Learning (ML) across various industries. Here are some examples:

Image and speech recognition: Image and speech recognition technology uses ML algorithms to enable machines to recognize and understand images and spoken language. Some common applications of this technology include virtual assistants like Siri and Alexa, facial recognition technology used in security systems, and self-driving cars.
Fraud detection: Financial institutions and credit card companies use ML algorithms to detect fraudulent transactions in real time. This technology enables companies to quickly identify and stop fraudulent transactions, protecting both the company and the customer.
Healthcare: ML is used in healthcare to improve patient outcomes and reduce costs. It can be used to identify patterns in patient data to enable early detection and treatment of diseases, as well as to develop personalized treatment plans.
Manufacturing: Manufacturing companies use ML algorithms to optimize their production processes, reduce waste, and improve product quality. For example, Tesla uses ML algorithms to improve the efficiency of their battery manufacturing process and to develop their autonomous driving technology.
Customer service: Many companies use ML algorithms to improve their customer service operations. This can include chatbots that can answer customer questions and resolve issues, as well as personalized marketing campaigns that are tailored to each individual customer's preferences and behavior.

This was just a short intro for the ML series that I will start (eventually). There isn't much to learn from this blog, but it can serve as an intro for someone just starting out. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Web Basics - Part 4

Saptarshi Bhattacharya — Wed, 15 Mar 2023 10:29:56 GMT

What are Cookies, Local Storage, and Session Storage?

Cookies, local storage, and session storage are all ways to store data in a user's browser while they are interacting with a website.

Cookies are small text files that are stored on a user's computer by a website they visit. They can be used to remember information about the user, such as their login details or their preferences for using the website. Cookies are important for web development because they allow websites to provide personalized experiences for users and to track user behavior for analytics purposes.

Local storage is a way to store larger amounts of data in a user's browser than is possible with cookies. Local storage is designed to be used for data that needs to persist beyond a single session, such as user preferences or settings. Local storage is important for web development because it provides a way to store data on the client side of a website, reducing the need to constantly request data from the server.

Session storage is similar to local storage, but the data it stores is only available for the duration of a page session, which is tied to a single browser tab or window. The data survives page reloads and navigation within that tab, and is deleted once the tab or window is closed. Session storage is important for web development because it provides a way to store data temporarily during a user's session, such as items in a shopping cart or form data.

Cookies

Cookies are small text files that are stored on a user's computer or device when they browse a website. These files contain information about the user's activities on the website, including preferences, settings, login information, and browsing history. Cookies are designed to enhance the user's experience by making it easier and faster for them to access the website's features and content.

There are several types of cookies:

Session cookies: These cookies are temporary and are deleted when the user closes their browser. They are used to remember the user's preferences and settings during a single session.
Persistent cookies: These cookies are stored on the user's computer even after they close their browser. They are used to remember the user's preferences and settings for future sessions.
First-party cookies: These cookies are set by the website that the user is visiting.
Third-party cookies: These cookies are set by a third-party website, such as an advertising network or analytics service.

Some examples of cookies include:

Authentication cookies: These cookies are used to remember that a user is logged in. They typically hold a session identifier or token issued by the server rather than the username and password themselves.
Shopping cart cookies: These cookies are used to remember the items that a user has added to their shopping cart on an e-commerce website.
Analytics cookies: These cookies are used to track a user's behavior on a website, such as which pages they visit and how long they stay on each page.
Advertising cookies: These cookies are used to display targeted ads to users based on their browsing history and interests.
Social media cookies: These cookies are used to integrate social media features into a website, such as the ability to share content on social media platforms.
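
As a rough illustration of session versus persistent cookies, here is a minimal Python sketch using the standard library's http.cookies module. The cookie names (session_id, theme) and their values are made up for this example; a real website would choose its own. The only difference between the two cookies is that the persistent one carries a Max-Age attribute.

```python
from http.cookies import SimpleCookie


def build_cookies() -> SimpleCookie:
    """Build two example cookies a server might send (names are hypothetical)."""
    cookies = SimpleCookie()

    # Session cookie: no Max-Age/Expires, so the browser drops it when it closes.
    cookies["session_id"] = "abc123"
    cookies["session_id"]["httponly"] = True  # hidden from page scripts
    cookies["session_id"]["path"] = "/"

    # Persistent cookie: Max-Age keeps it around for 30 days.
    cookies["theme"] = "dark"
    cookies["theme"]["max-age"] = 60 * 60 * 24 * 30
    cookies["theme"]["path"] = "/"
    return cookies


def parse_cookie_header(header: str) -> dict:
    """Parse the Cookie header a browser sends back into a plain dict."""
    parsed = SimpleCookie()
    parsed.load(header)
    return {name: morsel.value for name, morsel in parsed.items()}


for morsel in build_cookies().values():
    print("Set-Cookie:", morsel.OutputString())
```

On later requests the browser sends the stored values back in a single Cookie header, which parse_cookie_header turns into a dictionary on the server side.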

Local Storage

Local storage is a web technology that allows web applications to store data on the client-side (user's browser) beyond the lifetime of a single session. It is a way for web developers to save information that persists even after the browser is closed, allowing the user to pick up where they left off the next time they visit the website.

Local storage works by providing a key-value storage mechanism for data. The data is stored in the user's browser in a storage area separate from cookies, with a much larger capacity (typically a few megabytes per origin, compared with roughly 4 KB per cookie). Both keys and values are stored as strings, so other data types have to be serialized first, for example with JSON.stringify, and converted back with JSON.parse when read.

The data stored in local storage is specific to the origin of the website (the combination of protocol, domain, and port), meaning that it cannot be accessed by other websites. It is also accessible to all scripts running on that origin, making it a useful tool for sharing data between different parts of a web application.

Local storage is needed for several reasons:

Persistent data storage: Local storage allows web applications to store data that persists beyond the lifetime of a single session. This means that users can come back to the website at a later time and pick up where they left off, without losing any data.
Improved performance: Local storage can improve the performance of web applications by reducing the need for server requests. By storing data locally, the website can access it more quickly and efficiently, without having to make requests to the server.
Offline functionality: Local storage can be used to store data that is needed for offline functionality. For example, a web application that needs to be accessed in areas with poor internet connectivity can use local storage to store data that can be accessed even when the user is offline.
Enhanced user experience: Local storage can be used to store user preferences, settings, and other data that can improve the user experience. By storing this data locally, the website can personalize the user's experience and provide a more seamless experience overall.

Here are some examples of how local storage can be used:

Remembering user preferences: A website can use local storage to remember a user's preferences, such as their preferred language, font size, or theme. This allows the website to provide a more personalized experience for the user.
Storing form data: When a user is filling out a form on a website, local storage can be used to save their progress. This way, if the user accidentally closes the browser or navigates away from the page, they can come back and continue where they left off.
Saving game progress: Online games can use local storage to save a player's progress. This allows the player to come back later and resume playing from where they left off, without losing any progress.
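Cache management: Local storage can be used to store small, frequently used data, such as API responses or settings, to reduce the number of server requests. Since it only holds strings and has limited capacity, larger assets such as images are better left to the browser cache. This can improve the performance of the website and reduce page load times.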
Offline functionality: Web applications can use local storage to store data that is needed for offline functionality. For example, a note-taking application can use local storage to store the user's notes, which can be accessed even when the user is offline.

Session Storage

Session storage is a web storage technology that allows web applications to store data on the client-side (user's browser) for the duration of a single page session. A page session starts when a user opens a website in a browser tab and lasts until that tab or window is closed; it survives page reloads and navigation within the same tab. Session storage provides a way for web developers to store data temporarily, for use during the current session.

Session storage works by providing a key-value storage mechanism for data, similar to local storage. The data is stored in the user's browser and is specific to the origin of the website as well as to the tab it was created in. However, unlike local storage, the data stored in session storage is deleted when the user closes the tab or browser window.

Here are some examples of how session storage can be used:

Anonymous Shopping cart: Session storage can be used to store the items in a user's shopping cart during a single session without logging in. This allows the user to add and remove items from their cart without losing any data.
Page state: Session storage can be used to store the state of a webpage during a session. For example, if a user is on a webpage that allows them to filter results, session storage can be used to store their filter preferences so that they can be applied to subsequent searches.

Other Data Storages

Browser cache, IndexedDB, and Web SQL are three web technologies that can be used to store data on the client-side (user's browser) to improve website performance and provide offline functionality.

Browser cache: A browser cache is a mechanism used by web browsers to temporarily store web page data, such as HTML, CSS, and JavaScript files, images, and other assets. When a user visits a web page, the browser checks if it has cached versions of the requested resources. If cached versions exist, the browser loads them from the cache instead of making a new request to the server, which can significantly reduce page load times and improve website performance. The usefulness of browser cache is that it reduces the number of requests to the server, which in turn reduces server load and bandwidth usage. It also improves the user experience by allowing pages to load faster.
IndexedDB: IndexedDB is a client-side database technology that allows web developers to store and retrieve large amounts of structured data in the user's browser. It is designed to be a scalable storage solution that can store large amounts of data and work efficiently with large datasets. IndexedDB works by storing data in key-value pairs, with the ability to create indexes for efficient data retrieval. Data stored in IndexedDB can be queried using various APIs, allowing web applications to manipulate and retrieve data as needed. The usefulness of IndexedDB is that it provides a way for web developers to store large amounts of data on the client side, which can improve website performance by reducing the number of server requests. It also provides offline functionality, allowing web applications to continue working even when the user is not connected to the internet.
Web SQL: Web SQL is a deprecated client-side database technology that allowed web developers to store structured data in the user's browser. It relied on the SQLite database engine to provide a SQL interface for storing and retrieving data. The usefulness of Web SQL was that it provided a way for web developers to store data on the client side, allowing for improved website performance and offline functionality. However, it was never supported by all major browsers, and its specification was abandoned because every implementation depended on SQLite. IndexedDB is the recommended replacement.

These were some of the web storage mechanisms that help us retain/store data on the client side. They help with a lot of business logic as well as user experience improvements. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Web Basics - Part 3

Saptarshi Bhattacharya — Tue, 14 Mar 2023 13:58:08 GMT

Basic Architecture of the Web

The basic architecture of the web is a distributed client-server architecture, where the client sends requests to a server and the server sends responses back to the client. The client is typically a web browser or other application that requests information or services from a server. The server is a computer program or machine that provides the requested information or services to the client.

The client-server architecture is built on top of the TCP/IP protocol stack, which consists of a set of protocols that define how computers communicate over a network. At the application layer, the most commonly used protocols in the client-server architecture are HTTP (HyperText Transfer Protocol), FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and DNS (Domain Name System).

HTTP is the protocol that defines how web pages are transferred between clients and servers. It is a request-response protocol, where the client sends a request to the server and the server sends a response back to the client. FTP is the protocol that defines how files are transferred between clients and servers. SMTP is the protocol that defines how email messages are transferred between clients and servers.

Role and Anatomy of IP Addresses

An IP address is a unique numerical identifier that is assigned to devices connected to the internet. It stands for Internet Protocol address and is used to route data packets from one device to another.

There are two types of IP addresses: IPv4 and IPv6. IPv4 is a 32-bit address and can support up to 4.3 billion unique addresses. IPv6, on the other hand, is a 128-bit address and can support significantly more unique addresses. IPv6 was developed to address the shortage of IPv4 addresses.

IP addresses are used to identify devices on the internet. When a device connects to the internet, it is assigned an IP address. This IP address is used to route data packets from one device to another. When you enter a website's domain name into your browser, the browser uses DNS to translate that domain name into an IP address. The IP address is then used to establish a connection between your device and the website's server, allowing data to be transmitted back and forth.

An IP address consists of two parts: the network address and the host address. The network address is used to identify the network that the device belongs to, while the host address identifies the specific device on that network. The division between the network address and host address is determined by the subnet mask.

In IPv4, the first few bits of the address represent the network address, while the remaining bits represent the host address. The subnet mask is used to determine the number of bits used for the network address and the host address. For example, a subnet mask of 255.255.255.0 indicates that the first three octets of the IP address represent the network address, and the last octet represents the host address.
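
To make the network/host split concrete, here is a small Python sketch that applies a subnet mask to an address with a bitwise AND. The address 192.168.0.37 is just an example value, and split_address is a helper name invented for this sketch.

```python
import ipaddress


def split_address(address: str, mask: str):
    """Return (network address, host number) for an IPv4 address and mask."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))

    # Bits where the mask is 1 belong to the network part.
    network_bits = addr_bits & mask_bits
    # Bits where the mask is 0 belong to the host part.
    host_bits = addr_bits & ~mask_bits & 0xFFFFFFFF

    return str(ipaddress.IPv4Address(network_bits)), host_bits


network, host = split_address("192.168.0.37", "255.255.255.0")
print(network, host)  # 192.168.0.0 37
```

With the mask 255.255.255.0 the first three octets are kept as the network address and only the last octet identifies the host.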

In IPv6, the first 64 bits of the address typically represent the network address (the network prefix), while the remaining 64 bits represent the host address (the interface identifier). This /64 split is the convention for most networks rather than a hard rule; the prefix length is written in CIDR notation, just as in IPv4.

In summary, IP addresses are a crucial component of the internet as they allow devices to communicate with each other. They are used to identify devices on the internet and route data packets from one device to another.

How does DNS work?

DNS (Domain Name System) is a hierarchical system that translates domain names into IP (Internet Protocol) addresses and vice versa.

When a user types a domain name in their web browser, the computer sends a request to a DNS resolver to resolve the domain name into an IP address. The resolver first checks its cache to see if it has a record of the IP address for the domain name. If the resolver does not have a record, it sends a query to the root DNS server, which responds with a referral to the appropriate Top-Level Domain (TLD) DNS server for that domain.

The TLD DNS server then responds with a referral to the authoritative DNS server for the domain, which has the IP address information for the domain. The resolver then caches the IP address information and returns it to the user's computer, allowing the web browser to connect to the IP address and load the website associated with the domain name.

Conversely, a reverse DNS lookup translates an IP address back into a domain name. This does not happen when a user enters an IP address in their web browser; in that case the browser simply connects to the address directly. Reverse lookups are used by tools and services such as mail servers and network diagnostics. The process is similar to the forward DNS lookup, but it queries special reverse-lookup (PTR) records to find the domain name associated with the IP address.

Overall, the DNS system is crucial for navigating the internet and accessing websites through domain names rather than memorizing IP addresses.

Let's go through an example. Thank you Codedamn for explaining this to me. Check out Codedamn for great content.

The steps followed are:

Either our ISP provides a DNS resolver or we configure one ourselves. It is reachable at a hard-coded IP address (1.1.1.1 in this example) and maintains a kind of look-up table from domains to their IP addresses.
On getting a request, our OS asks the DNS resolver for the IP address of the website we are looking for. In this case "codedamn.com".
If the resolver has no cached answer, a request chain ensues where 1.1.1.1 (our DNS resolver) asks another server, for example, 65.24.11.22, which knows where the records for ".com" domains are kept. It replies with the IP of the server to ask next (77.22.11.2).
We do not get the IP of "codedamn.com" from 77.22.11.2, but it points us towards 44.11.232.55.
We finally get the required IP from 44.11.232.55.

These domain-to-IP conversions are cached so that the entire request chain is not executed again.
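
The request chain and the caching step can be sketched in a few lines of Python. This is a toy simulation, not real DNS: the server names (root, tld-com, auth-codedamn), the SERVERS table, and the final address 203.0.113.10 (taken from a range reserved for documentation) are all assumptions made up for illustration.

```python
# Each hypothetical name server maps a domain suffix either to a referral
# (the next server to ask) or to a final answer (an IP address).
SERVERS = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"codedamn.com": ("referral", "auth-codedamn")},
    "auth-codedamn": {"codedamn.com": ("answer", "203.0.113.10")},
}


class Resolver:
    """Toy DNS resolver that follows referrals and caches the final answer."""

    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, domain: str) -> str:
        if domain in self.cache:  # cached: skip the whole request chain
            return self.cache[domain]
        server = "root"
        while True:
            self.queries_sent += 1
            kind, value = self._ask(server, domain)
            if kind == "answer":
                self.cache[domain] = value
                return value
            server = value  # follow the referral to the next server

    @staticmethod
    def _ask(server: str, domain: str):
        records = SERVERS[server]
        labels = domain.split(".")
        # Try the most specific suffix first: "codedamn.com", then "com".
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in records:
                return records[suffix]
        raise LookupError(f"{server} knows nothing about {domain}")


resolver = Resolver()
print(resolver.resolve("codedamn.com"), resolver.queries_sent)
print(resolver.resolve("codedamn.com"), resolver.queries_sent)
```

The first lookup walks through three servers, while the second one is answered from the cache without sending any new query.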

Web Servers

Web servers are software programs that run on a server computer and handle HTTP requests and responses for web pages and web applications. When a user requests a web page or application from a web server, the server responds with the requested content, which is then displayed in the user's web browser.

Apache and Nginx are two of the most popular web servers in use today. Here's a brief overview of how they work:

Apache: Apache is an open-source web server that has been around since the mid-1990s. It uses a multi-process architecture where each incoming connection is handled by a separate process or thread. When a user makes an HTTP request, Apache receives the request and passes it to a worker process. The worker process then handles the request, generates the response, and sends it back to the user's web browser.

Apache is highly configurable and supports a wide range of modules that can be used to add functionality to the server. This makes it a popular choice for hosting dynamic websites and applications.

Nginx: Nginx (pronounced "engine-x") is a lightweight, high-performance web server that was created in 2004. It uses an event-driven, non-blocking architecture that allows it to handle a large number of simultaneous connections without using a lot of system resources. When a user makes an HTTP request, Nginx receives it in one of a small number of worker processes. Each worker runs an event loop and handles many connections at once instead of dedicating a process or thread to each one. The worker handles the request, generates the response, and sends it back to the user's web browser.

Nginx is often used as a reverse proxy, which means it sits in front of other web servers and distributes incoming requests to those servers. This can help improve the performance and scalability of a web application.

Both Apache and Nginx are powerful web servers with their own strengths and weaknesses. Choosing between them often depends on the specific needs of a website or application, as well as the preferences of the web administrator.

NOTE: You don't need to know about either of these servers to get into the backend development series. But basic knowledge is always a plus.

Sub-Net Masks

Subnet masks are a way of dividing a network into smaller subnetworks, or subnets. A subnet mask is a 32-bit binary number that is used to identify the network portion and the host portion of an IP address.

In IPv4, an IP address consists of 32 bits, divided into four 8-bit octets. Each octet represents a number between 0 and 255, and is separated by a period. For example, the IP address 192.168.0.1 is represented as 11000000.10101000.00000000.00000001 in binary.

A subnet mask is also a 32-bit binary number, where the network portion of the IP address is represented by a string of ones, and the host portion is represented by a string of zeros. For example, a subnet mask of 255.255.255.0 is represented as 11111111.11111111.11111111.00000000 in binary.

The subnet mask is used to determine which part of an IP address represents the network and which part represents the host. The network portion of an IP address is used to identify the network that the host belongs to, while the host portion is used to identify the individual host within the network.

By using subnet masks, a network can be divided into smaller subnets, each with its own network and host portion. This allows for more efficient use of IP addresses and can improve network performance and security.

In CIDR (Classless Inter-Domain Routing) notation, the subnet mask is specified by indicating the number of bits used for the network portion of the IP address. For example, /25 means that the first 25 bits of the IP address are used to identify the network portion, and the remaining 7 bits are used to identify the host portion.

For example, if a network has the IP address range 192.168.0.0/24 and a subnet mask of 255.255.255.0, it can be divided into smaller subnets with their own unique network portion and host portion. A subnet with the IP address range 192.168.0.0/25 would have a subnet mask of 255.255.255.128 and can support up to 126 hosts, while a subnet with the IP address range 192.168.0.128/25 would also have a subnet mask of 255.255.255.128.
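
The same split can be checked with Python's standard ipaddress module. This is only a sketch; describe_subnets is a helper name made up here, and the "usable hosts" figure assumes the usual convention of reserving the network and broadcast addresses in each subnet.

```python
import ipaddress


def describe_subnets(parent_cidr: str, new_prefix: int):
    """Split a network into subnets and list (range, mask, usable hosts)."""
    parent = ipaddress.ip_network(parent_cidr)
    rows = []
    for subnet in parent.subnets(new_prefix=new_prefix):
        # Two addresses per subnet are reserved: network and broadcast.
        usable_hosts = subnet.num_addresses - 2
        rows.append((str(subnet), str(subnet.netmask), usable_hosts))
    return rows


for row in describe_subnets("192.168.0.0/24", 25):
    print(row)
```

Each /25 has 7 host bits, which gives 2^7 = 128 addresses, of which 126 can be assigned to hosts.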

Security Considerations

Security considerations are crucial when it comes to both servers and clients. There are several key measures that can be taken to ensure the security of these systems, including the use of Firewalls, SSL/TLS, and HTTPS.

In computing, a Firewall is a network security system that monitors and controls the incoming and outgoing network traffic based on predetermined security rules. A firewall typically establishes a barrier between a trusted private network and an untrusted network, such as the public Internet. A firewall's main purpose is to allow non-threatening traffic in and to keep dangerous traffic out.
SSL/TLS is another important security measure that is commonly used in both servers and clients. SSL (Secure Sockets Layer) and its successor, TLS (Transport Layer Security), are cryptographic protocols that provide secure communication over the internet. SSL/TLS is commonly used to secure web traffic, email, and other types of online communications. SSL/TLS works by encrypting data transmitted between a server and a client, making it difficult for unauthorized users to intercept and read the data.
HTTPS is a secure version of the HTTP protocol used to transfer data over the internet. It uses SSL/TLS to encrypt the data transmitted between a server and a client, providing an additional layer of security to web traffic. HTTPS is commonly used to protect sensitive data, such as credit card numbers, passwords, and other personal information.

It is essential to keep these measures up to date and regularly review and update security protocols to maintain the highest level of protection.

Challenges in Scalability

Scalability is a crucial consideration for web servers that experience high levels of traffic. As web traffic increases, web servers may struggle to keep up with demand, which can lead to slow page load times, downtime, and other issues. There are several challenges that need to be addressed to achieve scalability in web servers, including load balancing, caching and clustering.

Load balancing is a technique that distributes incoming web traffic across multiple servers to avoid overloading any single server. Load balancing can be achieved through various methods, such as round-robin or weighted round-robin, IP hash, or least connections. Load balancing helps to ensure that incoming web traffic is evenly distributed among servers, reducing the load on any single server.

Caching is another technique that can be used to improve the scalability of web servers. Caching involves storing frequently accessed data in memory or on disk, which can significantly reduce the load on the server. Caching can be implemented at various levels, such as database caching, object caching, or page caching.
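
As a minimal sketch of page caching, the Python class below keeps a rendered page in memory for a fixed time before asking the backend again. TTLCache, get_or_load, and the 60-second lifetime are illustrative choices made for this example, not a reference to any particular caching library.

```python
import time


class TTLCache:
    """Keep values in memory and reload them after ttl_seconds have passed."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested without waiting
        self._entries = {}   # key -> (expiry time, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not touched
        value = loader(key)  # cache miss or expired: do the expensive work
        self._entries[key] = (now + self._ttl, value)
        return value


def render_page(path: str) -> str:
    """Stand-in for an expensive database query or template render."""
    return f"<html>{path}</html>"


cache = TTLCache(ttl_seconds=60)
print(cache.get_or_load("/home", render_page))
```

Every request served from the cache is one less query or render for the server, which is where the scalability benefit comes from.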

Clustering is a technique that involves grouping multiple servers together to act as a single system. Clustering can improve scalability by distributing the load among multiple servers, increasing the capacity of the system. Clustering also provides redundancy, which can help to ensure that the system remains available in the event of a server failure.

Load Balancing

Load balancing is the process of distributing workloads or traffic evenly across multiple resources, such as servers or network links, in order to optimize resource utilization, increase performance, and improve the availability and reliability of the system.

Load balancing is needed because as a system grows, it can become overwhelmed by requests or traffic, causing slowdowns or failures. By distributing the workload across multiple resources, load balancing ensures that no single resource becomes overwhelmed, thereby improving the overall performance and availability of the system.

Load balancing is helpful in many ways, including:

Scalability: Load balancing allows a system to scale out by adding more resources to handle increasing demand, rather than scaling up by adding more capacity to a single resource.
Fault tolerance: Load balancing ensures that if one resource fails, traffic can be automatically rerouted to another resource to avoid downtime or service interruption.
Performance: Load balancing can improve the performance of a system by evenly distributing traffic across multiple resources, which can help reduce response times and increase throughput.

There are different load-balancing algorithms that can be used to determine how traffic is distributed among resources. These include:

Round-robin: Traffic is distributed in a cyclical manner among the available resources.
Least connections: Traffic is sent to the resource with the fewest active connections.
IP hash: Traffic is distributed based on the source or destination IP address.
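
The three algorithms above can be sketched in Python as follows. The server names are placeholders, and the classes are simplified for illustration; real load balancers also handle health checks, weights, and concurrency, which are left out here.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers):
        self._cycle = cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1  # call when a connection finishes


def ip_hash(servers, client_ip: str):
    """Map a client IP to a server so the same client lands on the same one."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["app-1", "app-2", "app-3"]
rr = RoundRobin(servers)
print([rr.pick() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
```

Round-robin ignores how busy each server is, least connections reacts to current load, and IP hash trades even distribution for keeping a client on the same server.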

Load balancing can be implemented at different layers of the network stack, including the application layer, transport layer, and network layer. Application-level load balancing involves distributing traffic based on application-specific criteria, such as HTTP headers or cookies. Transport-level load balancing involves distributing traffic based on transport-layer information, such as TCP or UDP ports and connections. Network-level load balancing involves distributing traffic based on network-layer information, such as IP addresses or routing information.

Overall, load balancing is a key technique for improving the performance, availability, and scalability of modern computer systems, and it plays an important role in ensuring that critical applications and services remain available and responsive to users.

Content Delivery Networks (CDN)

A CDN (Content Delivery Network) is a system of distributed servers that deliver web content to users based on their geographic location. The goal of a CDN is to reduce latency, improve page load times, and increase the availability of web content.

When a user requests a piece of content, such as an image or video, the request is routed to the nearest CDN server, which is usually the one with the lowest latency or closest geographic proximity to the user. The CDN server then serves the content to the user from its cache, so the request does not need to travel all the way to the origin server where the content is hosted. If the CDN server does not have the content yet, it fetches it from the origin server once and keeps a copy for later requests.
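
Here is a toy Python model of that flow. The edge locations, latencies, and the /logo.png path are invented for the example, and picking the edge by a stored latency number is a simplification of how real CDNs route requests.

```python
class EdgeServer:
    """A CDN edge location with its own local cache."""

    def __init__(self, name: str, latency_ms: int):
        self.name = name
        self.latency_ms = latency_ms  # assumed latency from the user to this edge
        self.cache = {}


def fetch(edges, path: str, origin: dict):
    """Serve a path from the lowest-latency edge, filling its cache on a miss."""
    edge = min(edges, key=lambda e: e.latency_ms)
    if path in edge.cache:
        return edge.name, "HIT", edge.cache[path]
    content = origin[path]      # one trip to the origin server
    edge.cache[path] = content  # keep a copy for later requests
    return edge.name, "MISS", content


origin = {"/logo.png": b"fake image bytes"}
edges = [EdgeServer("frankfurt", 80), EdgeServer("mumbai", 12)]

print(fetch(edges, "/logo.png", origin)[:2])  # ('mumbai', 'MISS')
print(fetch(edges, "/logo.png", origin)[:2])  # ('mumbai', 'HIT')
```

Only the first request for a file reaches the origin; later requests from nearby users are answered by the edge.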

CDNs can improve performance in several ways:

Reduced latency: By serving content from a nearby server, CDNs can reduce the time it takes for content to travel from the origin server to the user's device, reducing latency and improving page load times.
Improved availability: CDNs can help ensure that content is always available by replicating it across multiple servers. If one server fails or becomes overloaded, the request can be automatically routed to another server.
Reduced network congestion: By serving content from a nearby server, CDNs can reduce the amount of traffic that needs to travel over long distances, which can help reduce network congestion and improve overall network performance.
Caching: CDNs can cache frequently accessed content on edge servers, which can reduce the load on origin servers and improve page load times for subsequent requests.
Security: CDNs can provide security features such as DDoS protection and SSL encryption to help protect against attacks and improve the security of web content.

Overall, CDNs are an important tool for improving the performance and availability of web content, particularly for websites or applications with a global user base. By leveraging a distributed network of servers, CDNs can help reduce latency, improve availability, and enhance overall user experience.

That was most of the basics cleared out of the way before we move on to the backend dev blog. We have to touch on a few bits and bobs, which we will take care of in the next post. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Web Basics - Part 2

Saptarshi Bhattacharya — Sun, 12 Mar 2023 21:12:26 GMT

Introduction to Protocols

Protocols are essential to modern communication, enabling devices to exchange information over networks in a standardized and reliable manner. Put simply, a protocol is a set of rules that governs how data is transmitted and received between different devices on a network. These rules ensure that data is transmitted correctly and that devices are able to understand and interpret the data they receive.

The need for protocols arose as computer networks became more widespread, with different types of devices and software needing to communicate with each other over a common network. After the invention of the ARPANet, multiple networks were created, leading to compatibility issues and communication problems between different devices. This problem was addressed by the development of standardized protocols that could be used across different networks and devices.

Today, there are numerous networking protocols in use, ranging from the well-known HTTP protocol used for transmitting data over the web, to the complex TCP/IP protocol suite that forms the backbone of the internet. Understanding these protocols is essential for anyone who works with computers or computer networks, as it allows them to troubleshoot problems, optimize network performance, and ensure that their systems are secure and reliable.

Types of Internet Protocols

There are multiple Internet Protocols. Each of them is described in detail here:

TCP/IP is a protocol suite that provides the fundamental communication protocols for the internet and most modern computer networks. It was developed in the 1970s by the United States Department of Defense, and it is now widely used in both public and private networks. It is a set of standard rules that allows different types of computers to communicate with each other. The IP protocol ensures that each computer connected to the Internet has a specific numerical identifier called the IP address. TCP specifies how data is exchanged over the internet and how it is broken into segments that are carried in IP packets. It also makes sure that the packets carry information about the source of the message data, the destination of the message data, and the sequence in which the message data should be re-assembled, and it checks whether the message has been delivered correctly to the destination. TCP is also known as a connection-oriented protocol. TCP/IP is composed of four layers, each of which performs a specific set of functions:
1. Application layer: The application layer is the top layer of the TCP/IP protocol stack. It is responsible for handling the specific protocols used by applications, such as HTTP for web browsing, SMTP for email, and FTP for file transfers.
2. Transport layer: The transport layer is responsible for ensuring that data is transmitted reliably and accurately between devices. This layer uses two protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP is a connection-oriented protocol that provides reliable transmission of data, while UDP is connectionless and is used for applications that require the fast and lightweight transmission of data.
3. Internet layer: The internet layer is responsible for addressing and routing data across networks. This layer uses the Internet Protocol (IP) to assign unique addresses to devices and to determine the best path for data to travel from the source to the destination.
4. Link layer: The link layer is the lowest layer of the TCP/IP protocol stack. It is responsible for transmitting data over physical media, such as Ethernet or Wi-Fi. This layer is responsible for data framing, error detection, and flow control.

Together, these four layers of the TCP/IP protocol suite provide a standardized set of protocols that allow devices to communicate with each other over networks, regardless of their underlying hardware or software.
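
The idea of breaking a message into numbered pieces and re-assembling them in order, as TCP does, can be sketched in Python. This is a simplification: real TCP numbers bytes rather than whole segments and also handles acknowledgements and retransmission, none of which is modelled here. The function names are made up for the example.

```python
import random


def segment(message: bytes, size: int):
    """Split a message into (sequence number, chunk) pairs."""
    return [
        (seq, message[offset:offset + size])
        for seq, offset in enumerate(range(0, len(message), size))
    ]


def reassemble(segments):
    """Put segments back in sequence order and join the chunks."""
    return b"".join(chunk for _, chunk in sorted(segments))


message = b"hello, world"
segments = segment(message, 5)
random.shuffle(segments)  # packets may arrive out of order
print(reassemble(segments) == message)  # True
```

Because every piece carries its sequence number, the receiver can rebuild the original message no matter what order the pieces arrive in.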

HTTP (Hypertext Transfer Protocol) is a protocol used for transmitting data over the World Wide Web. It is a part of the application layer of the TCP/IP protocol suite. HTTP is used by web browsers, such as Google Chrome and Mozilla Firefox, to access web pages on the internet. HTTP works by establishing a connection between the user's computer and the web server hosting the web page, after which the server sends the web page data to the user's computer. HTTP supports a variety of data types, including text, images, and video. This protocol defines how the information needs to be formatted and transmitted. And, it also defines the various actions the web browsers should take in response to the calls made to access a particular web page.
FTP (File Transfer Protocol) is a protocol used for transferring files between two devices over a network. It is a part of the application layer of the TCP/IP protocol suite. FTP is used to upload and download files from a remote server. FTP works by establishing a connection between the user's computer and the remote server, after which the user can transfer files to and from the server. When a machine requests a file transfer from another machine, FTP sets up a connection between the two, the client authenticates itself to the server with a username and password, and the desired file transfer then takes place between the machines.
SMTP (Simple Mail Transfer Protocol) is a protocol used for sending email messages over the internet (retrieving mail from a server is handled by protocols such as POP3 and IMAP). It is a part of the application layer of the TCP/IP protocol suite. SMTP is used by email clients, such as Microsoft Outlook, to send emails to an SMTP server, which then forwards the email to its intended recipient. SMTP works by establishing a connection between the email client and the SMTP server, after which the client sends the email message to the server, which then relays the message to the recipient's email server. The server uses the recipient address supplied with the message to place the mail in its queue of outgoing mail. As soon as it delivers the mail to the receiving server, it removes the email from the outgoing queue. The message may contain text, video, images, etc. SMTP also defines the rules that mail servers follow when communicating with each other.
SFTP (SSH File Transfer Protocol, often called Secure File Transfer Protocol): SFTP is a file transfer protocol that runs over Secure Shell (SSH), so both commands and data are encrypted while in transmission. Despite the name, it is not FTP tunneled through SSH but a separate protocol designed as an extension to SSH. It is used to transfer and manage files on remote systems over the same secure channel that SSH uses for remote command-line access.
HTTPS (HyperText Transfer Protocol Secure): HTTPS is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, with TLS (the successor to SSL) providing encryption and authentication. A website can be served over plain HTTP, but if it receives sensitive information such as credit card details, debit card details, or OTPs, it needs an SSL/TLS certificate installed so that it can be served over HTTPS. So, before entering any sensitive information on a website, we should check whether the link is HTTPS. If it is not, the site may not be secure enough for sensitive information.
ICMP (Internet Control Message Protocol) is a network protocol that is used to send error messages and operational information about network conditions. It is an integral part of the Internet Protocol (IP) suite and is used to help diagnose and troubleshoot issues with network connectivity. ICMP messages are typically generated by network devices, such as routers, in response to errors or exceptional conditions encountered in forwarding a datagram. Some examples of ICMP messages include:
- Echo Request and Echo Reply (ping)
- Destination Unreachable
- Time Exceeded
- Redirect

ICMP can also be used by network management tools to test the reachability of a host and measure the round-trip time for packets to travel from the source to the destination and back. Note that ICMP has no built-in security, and it can be abused in some types of network attacks, such as ICMP floods and Smurf-style DDoS amplification.

UDP (User Datagram Protocol) is a connectionless, unreliable transport layer protocol. Unlike TCP, it does not establish a connection between devices before transmitting data, and it does not guarantee that data packets will be received in the order they were sent or that they will be received at all. Instead, UDP simply sends packets of data to a destination with only a basic checksum and no retransmission or flow control. UDP is typically used for real-time applications such as streaming video and audio, online gaming, and VoIP (Voice over Internet Protocol), where a small amount of lost data is acceptable and low latency is important. UDP is faster than TCP because it has less overhead. It doesn't need to establish a connection, so it can send data packets immediately. It also doesn't need to wait for confirmation that the data was received before sending more, so it can transmit data at a higher rate.
IMAP (Internet Message Access Protocol) is a protocol used for retrieving emails from a mail server. It allows users to access and manage their emails on the server, rather than downloading them to a local device. This means that the user can access their emails from multiple devices and the emails will be synced across all devices. IMAP is more flexible than POP3 (Post Office Protocol version 3) as it allows users to access and organize their emails on the server, and also allows multiple users to access the same mailbox.
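Going back to HTTP for a moment: the "formatting" it defines is, in HTTP/1.1, just lines of text sent over a TCP connection. The sketch below builds a request and parses the first line of a response by hand. The helper names are hypothetical and example.com is a placeholder host; in practice a browser or an HTTP library does this work.

```javascript
// Builds the text of a simple HTTP/1.1 GET request.
// A blank line (\r\n\r\n) marks the end of the headers.
function buildGetRequest(host, path) {
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    'Connection: close',
    '',
    '',
  ].join('\r\n');
}

// Parses the status line of a raw HTTP response, e.g. "HTTP/1.1 200 OK".
function parseStatusLine(rawResponse) {
  const [statusLine] = rawResponse.split('\r\n');
  const [version, code, ...reason] = statusLine.split(' ');
  return { version, statusCode: Number(code), reason: reason.join(' ') };
}

console.log(buildGetRequest('example.com', '/index.html'));
console.log(parseStatusLine('HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n'));
// { version: 'HTTP/1.1', statusCode: 404, reason: 'Not Found' }
```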

NOTE: An SSL (Secure Sockets Layer) certificate is a digital certificate that encrypts and authenticates data transmission over the internet. It is used to establish a secure connection between a web server and a web browser or other client software, ensuring that all data transferred between the two is encrypted and cannot be intercepted or tampered with by unauthorized third parties.

SSL certificates are issued by trusted third-party Certificate Authorities (CAs) after verifying the identity of the website owner and ensuring that they have the legal right to use the domain name associated with the website. Once installed on the web server, the SSL certificate activates the HTTPS protocol, which adds an additional layer of security to the standard HTTP protocol used for web communication.

When a user accesses a website with an SSL certificate, their web browser checks the certificate to ensure that it is valid and issued by a trusted CA. If the certificate is valid, the browser initiates a secure session with the website, which encrypts all data transmitted between the two parties using a cryptographic key.

SSL certificates are essential for protecting sensitive information, such as personal data, login credentials, and financial transactions, from interception or theft by hackers or other malicious actors. They are widely used by e-commerce sites, banks, healthcare providers, and other organizations that handle sensitive information online.

Are all these protocols children of TCP/IP?

Not all of the protocols I mentioned are part of the core of TCP/IP, but they all run on top of it. For example, DNS, DHCP, SNMP, and SSH are application layer protocols that rely on the transport layer underneath: DNS, DHCP, and SNMP typically use UDP, while SSH uses TCP. SMTP, POP3, and IMAP also operate at the application layer, which is the top layer of the TCP/IP protocol stack. RTP and RTSP are often used for streaming media over the internet. SIP is also an application layer protocol that is used for voice and video communication over IP networks. So while these protocols are not all part of the original TCP/IP specifications, they are used together with TCP/IP to enable communication over computer networks and the internet.

NOTE: Not all the Protocols are mentioned. Only the important ones are listed above. Basic Knowledge about these protocols is enough to get started with Web Dev.

Some other important protocols.

IPv4 and IPv6: IPv4 (Internet Protocol version 4) and IPv6 (Internet Protocol version 6) are two different versions of the Internet Protocol, which is the fundamental protocol that is used for communication over the internet. IPv4 is the older of the two protocols and has been in use since the early days of the internet. It uses 32-bit addresses and is capable of supporting up to about 4.3 billion unique addresses. However, with the explosion of internet-connected devices in recent years, the available pool of IPv4 addresses has been rapidly depleted, leading to the adoption of IPv6. IPv6 uses 128-bit addresses, which allows for an almost unlimited number of unique addresses (about 3.4 × 10^38, or roughly 340 undecillion, a 39-digit number). This is far more than enough to provide every device on the planet with a unique address and allow for the continued growth of the internet. IPv6 also includes several other improvements over IPv4, including better support for quality of service (QoS) and security, as well as simpler address allocation and configuration. While IPv6 has been available for many years, the adoption of the new protocol has been slow due to the need to upgrade existing network infrastructure and devices to support the new standard. However, as the pool of available IPv4 addresses continues to shrink, the need for widespread adoption of IPv6 is becoming more urgent.
SSH: SSH (Secure Shell) is a protocol used for secure remote login and other secure network services. It provides a secure and encrypted way to remotely access and manage servers, network devices, and other computer systems. SSH uses public-key cryptography to authenticate the user and encrypt the data being transmitted, making it much more secure than traditional remote login protocols such as Telnet. SSH also allows for secure file transfers using the SCP (Secure Copy) and SFTP (Secure File Transfer Protocol) protocols. It is widely used in Unix-based operating systems and is also available for Windows. It is commonly used by system administrators, developers, and other technical users to remotely access and manage servers and other network devices.
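The IPv4 and IPv6 address counts mentioned above follow directly from the address widths: an n-bit address gives 2^n possible values. A quick sketch to check the arithmetic (the helper name is just for illustration):

```javascript
// Number of distinct addresses for a given address width in bits.
// BigInt is used because 2^128 does not fit in a regular JavaScript number.
function addressCount(bits) {
  return 2n ** BigInt(bits);
}

const ipv4 = addressCount(32);
const ipv6 = addressCount(128);

console.log(ipv4.toString());        // 4294967296
console.log(ipv6.toString());        // 340282366920938463463374607431768211456
console.log(ipv6.toString().length); // 39
```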

We will talk more about IPs, Domains, Subnet Masks, and other related concepts in the next blog. This was just a packed blog containing all the basic theory stuff as a refresher. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Web Basics - Part 1

Saptarshi Bhattacharya — Sun, 12 Mar 2023 13:40:37 GMT

The History of the Web

The internet started as a research project in the late 1960s by the United States Department of Defense's Advanced Research Projects Agency (ARPA). The goal was to create a network that would allow researchers at different universities and institutions to communicate and share information more efficiently.

The initial version of the internet, called ARPANET, was created in 1969 and connected four sites in the United States (three universities and a research institute). It used packet-switching technology to transmit data between computers, which allowed information to be broken down into small packets and sent across the network. This was a significant improvement over earlier communication systems that used dedicated point-to-point connections.

Over the next few decades, the internet grew and evolved rapidly. Between 1989 and 1991, the development of the World Wide Web (WWW) by Tim Berners-Lee at CERN in Switzerland allowed users to access and share information using hypertext links. This made the internet much more user-friendly and accessible to a broader audience.

In the 1990s, the commercialization of the internet began, and companies started building websites and offering online services to consumers. The development of web browsers and search engines made it easier for users to find and navigate the web.

Today, the internet is a vast network of interconnected computers and servers, and it's an essential part of modern life. From email and social media to online shopping and streaming video, the internet has transformed the way we communicate, work, and interact with the world around us.

The arrival of TCP/IP

After the emergence of individual networks like ARPANET, one of the main problems was that these networks were using different protocols and technologies. This made it difficult to connect them to one another and share information between them. For example, a computer on one network might not be able to communicate with a computer on another network because they were using different communication protocols.

To solve this problem, the TCP/IP protocol was developed in the 1970s. TCP/IP stands for Transmission Control Protocol/Internet Protocol, and it's a set of standards for transmitting data over networks, including the internet. TCP is responsible for breaking data into packets, reassembling them at the destination, and ensuring that they arrive in the correct order. IP, on the other hand, is responsible for addressing and routing packets across the network.

By using a common set of standards, TCP/IP made it possible to connect different networks and communicate between them. This laid the foundation for the internet as we know it today.
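The job described for TCP above (break data into packets, then reassemble them in order) can be sketched in a few lines. This is a toy model with made-up function names: real TCP works on bytes and also handles acknowledgements, retransmission, and much more.

```javascript
// Split a message into numbered segments, loosely like TCP does.
function toSegments(message, size) {
  const segments = [];
  for (let offset = 0; offset < message.length; offset += size) {
    segments.push({ seq: offset, data: message.slice(offset, offset + size) });
  }
  return segments;
}

// Put segments back in order using their sequence numbers.
function reassemble(segments) {
  return [...segments]
    .sort((a, b) => a.seq - b.seq)
    .map((segment) => segment.data)
    .join('');
}

const segments = toSegments('Hello, internet!', 4);
// Simulate out-of-order arrival by reversing the segments.
const arrived = [...segments].reverse();
console.log(reassemble(arrived)); // Hello, internet!
```

The sequence number is what lets the receiver rebuild the original message even though IP makes no promise about the order in which packets arrive.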

The birth of the World Wide Web was another significant development in the history of the internet. In 1989, Tim Berners-Lee, a computer scientist at CERN in Switzerland, proposed a new way of sharing and accessing information on the internet using hypertext links. This was the beginning of the World Wide Web.

Berners-Lee developed three key technologies to make the web work: HTML (Hypertext Markup Language), which is used to create web pages; HTTP (Hypertext Transfer Protocol), which is used to transfer data between web servers and clients; and URLs (Uniform Resource Locators), which are used to identify and locate web pages on the internet.

These technologies made it possible to create and share information on the web in a way that was easy to use and accessible to a broad audience. The web quickly grew in popularity and became an essential part of the internet, leading to the development of new technologies and applications that continue to shape the way we use and interact with the internet today.

Contributions of Sir Tim Berners Lee

Tim Berners-Lee is the inventor of the World Wide Web, which revolutionized the way we share and access information on the internet. Berners-Lee developed three key technologies to make the web work:

HTML (Hypertext Markup Language): HTML is a markup language used to create web pages. It provides a standardized way of structuring content on the web, using tags and attributes to define headings, paragraphs, images, links, and other elements of a web page. HTML allows web developers to create rich, interactive content that can be viewed and accessed by anyone with an internet connection.
HTTP (Hypertext Transfer Protocol): HTTP is the protocol used to transfer data between web servers and clients. It enables clients (such as web browsers) to request web pages and other resources from servers, and it allows servers to respond with the requested data. HTTP is a stateless protocol, which means that each request and response is independent of any previous requests or responses. This keeps servers simple and easier to scale, because they do not have to remember earlier requests; state such as a login is carried by mechanisms like cookies.
URLs (Uniform Resource Locators): URLs are used to identify and locate resources on the web, such as web pages, images, and other files. A URL consists of several parts, including the protocol (HTTP or HTTPS), the domain name of the server, and the path to the resource on the server. URLs make it easy for users to navigate the web and access the content they're looking for.

Together, these technologies provide the foundation for the World Wide Web and enable users to create, share, and access information on the internet. They have played a crucial role in shaping the modern internet and continue to evolve and improve as technology advances.
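The parts of a URL described above can be inspected with the built-in URL class, which is available as a global in modern browsers and in Node.js. The address used here is a made-up example.

```javascript
// Parse a URL into its components with the standard URL class.
const url = new URL('https://example.com/blog/web-basics?part=1#history');

console.log(url.protocol);                 // https:
console.log(url.hostname);                 // example.com
console.log(url.pathname);                 // /blog/web-basics
console.log(url.searchParams.get('part')); // 1
console.log(url.hash);                     // #history
```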

Web Trilogy

The transition from Web1 to Web2 to Web3 represents a shift in the way the internet is used and accessed, as well as the technologies that underpin it. Here is a brief overview of each phase:

Web1: The first phase of the web, also known as the static web, was characterized by static HTML pages that provided basic information and limited interaction. Users could browse web pages but could not interact with them beyond clicking on links.
Web2: The second phase of the web, also known as the social web, saw the emergence of dynamic and interactive web pages that enabled user-generated content, social networking, and e-commerce. Web2 also saw the rise of mobile devices and the use of cloud computing to provide scalable and flexible services.
Web3: The third phase of the web, also known as the decentralized web or web3, is characterized by a move towards decentralization, blockchain technology, and the use of cryptocurrencies. Web3 aims to provide users with greater control over their data and digital identities and to create decentralized applications that are more secure and resilient.

Centralization vs decentralization is a key theme in the evolution of the web. Centralization refers to the concentration of power and control in the hands of a few large organizations, such as social media giants like Facebook or Twitter. Decentralization, on the other hand, involves distributing power and control across a network of users, creating a more democratic and resilient system.

Early examples of decentralization include Napster, a peer-to-peer file-sharing service that allowed users to share music files directly with each other (although it still relied on a central index to find them), and BitTorrent, a file-sharing protocol in which users download pieces of a file from one another instead of from a single central server. These technologies represented a shift away from traditional client-server architectures and towards a more distributed and peer-to-peer approach.

Overall, the transition from Web1 to Web2 to Web3 represents an ongoing evolution of the web, driven by changing user needs and technological advances.

NOTE: This is just a short blog on the history of the web. This is just an introduction blog and some theory before the rest of the Web Basics blog and the Backend Development Blog.

Evolution of Web Technologies

The evolution of web technologies has been a key driver in the development of the web over the past several decades. Here is a brief overview of some of the key technologies that have shaped the web:

HTML: Hypertext Markup Language (HTML) is a markup language used to create web pages. The first version of HTML was introduced in 1991, and it has since evolved through several versions, with HTML5 being the most recent. HTML is the foundation of the web, providing the structure and content of web pages.
CSS: Cascading Style Sheets (CSS) is a language used to define the visual style of web pages, including layout, fonts, colors, and more. CSS was introduced in 1996 as a way to separate the presentation of web pages from their content, allowing for greater flexibility and control.
JavaScript: JavaScript is a programming language used to add interactivity and functionality to web pages. It was introduced in 1995 and has since become one of the most widely used programming languages in the world. JavaScript allows developers to create dynamic and responsive web pages, as well as to build complex web applications.
XML: Extensible Markup Language (XML) is a markup language used to describe data and its structure. It was introduced in 1998 as a way to provide a standardized format for exchanging data over the web. XML is often used in web services and APIs to exchange data between different systems.
AJAX: Asynchronous JavaScript and XML (AJAX) is a technique used to create fast and dynamic web applications. It allows web pages to update content without requiring a full page reload, making web applications feel more like desktop applications. The term AJAX was coined in 2005, and it has since become a popular technique for building web applications.

These are just a few examples of the many technologies that have shaped the evolution of the web over the years. As technology continues to evolve, we can expect to see even more innovations in the years to come. You will get to see a lot more technologies than the ones mentioned here in the blogs coming ahead.

If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Getting into Node.js

Saptarshi Bhattacharya — Tue, 07 Mar 2023 14:37:03 GMT

Node.js is an open-source, cross-platform, back-end JavaScript runtime environment that is built on Chrome's V8 JavaScript engine. It was created by Ryan Dahl in 2009 and is now hosted by the OpenJS Foundation, which was formed in 2019 when the Node.js Foundation merged with the JS Foundation. Node.js is used to build fast, scalable network applications and web servers, and its popularity has been growing rapidly since its inception.

Node.js was created to address the need for a non-blocking, event-driven I/O model that can handle large amounts of data and connections with high performance. Node.js uses an event loop that listens for incoming requests and responds to them asynchronously, which makes it ideal for handling real-time, data-intensive applications.

Node.js is built on top of Google's V8 JavaScript engine, which compiles JavaScript code at runtime (just-in-time) into native machine code that can be executed directly by the computer's processor. This is one reason Node.js can often execute code faster than purely interpreted runtimes.

NOTE: Nodejs is not a framework. It is a runtime environment that is the basis for other frameworks like Express.

Features

Some features of Nodejs are:

Asynchronous and event-driven: Node.js is built on an event-driven, non-blocking I/O model that makes it lightweight and efficient. (more on this later)
Cross-platform: Node.js is built to work on a variety of platforms, including Windows, macOS, and Linux.
Fast: Node.js is built on the V8 JavaScript engine, which is known for its high performance and speed.
Scalable: Node.js is designed to handle large-scale applications with ease, thanks to its non-blocking I/O model and event-driven architecture.
Extensible: Node.js has a large ecosystem of modules and libraries that can be easily installed and used in your applications.
Server-side scripting: Node.js is often used for server-side scripting, allowing developers to build powerful server-side applications using JavaScript.
Easy to learn: With its simple syntax and extensive documentation, Node.js is easy to learn for developers who are familiar with JavaScript.
Community-driven: Node.js has a large and active community of developers, who are constantly contributing to the development of new tools, modules, and libraries.
Open source: Node.js is an open-source platform, which means that anyone can contribute to its development and use it for free.

Nodejs Architecture

Let us have a look at the Nodejs architecture, which will help us understand quite a lot of its features.

Node.js architecture is based on an event-driven, non-blocking I/O model that makes it highly efficient and scalable. The architecture consists of several components:

V8 Engine: Node.js uses the Google V8 JavaScript engine to execute JavaScript code. This engine compiles JavaScript into native machine code that can be executed directly by the CPU, making it faster than traditional interpreters.
Libuv: Node.js uses the Libuv library to provide an event loop and other asynchronous I/O capabilities. This library is also responsible for handling file system operations, networking, and other system-level functions.
Node.js APIs: Node.js provides a number of built-in APIs for interacting with the file system, networking, and other system-level functions. Many of these APIs are implemented in C/C++ and are exposed to JavaScript via bindings, with a JavaScript layer on top.
Node.js Runtime: The Node.js runtime is the environment in which Node.js applications run. It provides a JavaScript execution environment, as well as access to system resources such as the file system, network, and operating system.
Operating System: Node.js applications run on top of an operating system, such as Linux, Windows, or macOS. The operating system provides access to hardware resources, such as the CPU, memory, and network interfaces.

Node.js uses an event loop to handle incoming requests and execute non-blocking I/O operations. The event loop is a loop that constantly checks for events and executes the associated callback functions. When a new request comes in, it is added to the event queue, and when the event loop reaches that event, it executes the associated callback function. This means that Node.js can handle many requests simultaneously, without blocking the execution of other requests.

Node.js achieves this by offloading time-consuming I/O operations instead of waiting for them on the main thread. Network I/O is handed to the operating system's non-blocking facilities, while operations that have no non-blocking equivalent, such as reading or writing to the file system, are sent to a separate thread pool. When such an operation is requested, Node.js adds it to a queue and sends it to the thread pool, freeing up the main thread to continue processing other requests. When the operation is complete, the thread pool returns the result to the main thread, which then executes the callback function.

Although Node.js uses a thread pool to offload I/O operations, it is still considered single-threaded because all user code runs on a single thread, and the thread pool is managed internally by Node.js. Additionally, because Node.js uses non-blocking I/O operations, the thread pool is only used for time-consuming tasks, and most operations can be handled by the main event loop.
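As a rough sketch of the thread pool at work, the snippet below starts a CPU-heavy key derivation with crypto.pbkdf2(), whose asynchronous form runs off the main thread. The password, salt, and iteration count are arbitrary example values.

```javascript
const crypto = require('crypto');

const order = [];

// The asynchronous pbkdf2 call is handed to the thread pool;
// the callback runs on the main thread once the work is done.
crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
  if (err) throw err;
  order.push('hash ready');
  console.log('Derived key length:', key.length);
});

// This runs first: the main thread did not wait for the hash.
order.push('main thread continues');
console.log('Main thread keeps going while the hash is computed');
```

The user code (the callback and the last two lines) all runs on the single main thread; only the hashing itself happens elsewhere.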

https://www.youtube.com/watch?v=6YgsqXlUoTM

Check out this video if you want to know about the event loop in detail. Or if you have half an hour you can go for the one provided below.

https://www.youtube.com/watch?v=8aGhZQkoFbQ

What are Events?

Events in Node.js are a mechanism used to handle and respond to actions or signals that occur asynchronously in a program. In Node.js, the EventEmitter class is used to implement the event-driven architecture. The EventEmitter allows developers to create custom events and handle them with corresponding listeners.

Events in Node.js work on a publisher-subscriber model, where publishers emit events and subscribers listen to those events. When an event is emitted, all subscribers to that event are notified and can respond accordingly. This makes it possible for Node.js applications to be highly responsive to user input and other events that occur in the system.

Node.js has a built-in events module that provides the EventEmitter class and other related utilities for working with events. The events module can be used to implement complex event-driven systems such as web servers, real-time chat applications, and other systems that need to handle multiple concurrent connections.
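Here is a minimal sketch of the publisher-subscriber model using EventEmitter. The chat room, the event name, and the arguments are made up for illustration.

```javascript
const EventEmitter = require('events');

// A hypothetical chat room that publishes 'message' events.
const chatRoom = new EventEmitter();
const received = [];

// Subscribers register listeners for a named event.
chatRoom.on('message', (user, text) => {
  received.push(`${user}: ${text}`);
});
chatRoom.on('message', (user) => {
  console.log(`New message from ${user}`);
});

// The publisher emits the event; every listener is called in registration order.
chatRoom.emit('message', 'alice', 'Hello!');
console.log(received); // [ 'alice: Hello!' ]
```

Note that emit() calls the listeners synchronously, one after another; the asynchrony in a real application comes from whatever triggers the emit (a socket, a timer, a file read).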

The Non-Blocking model

This is one of the features of the Nodejs architecture. Let us first understand what blocking code vs non-blocking code is from GeeksforGeeks.

Blocking: It refers to the blocking of further operations until the current operation finishes. Blocking methods are executed synchronously. Synchronously means that the program is executed line by line. The program waits until the called function or the operation returns.

Example: The following example uses the readFileSync() function to read files and demonstrate Blocking in Node.js

const fs = require('fs');
const filepath = 'text.txt';
// Reads a file in a synchronous and blocking way
const data = fs.readFileSync(filepath, {encoding: 'utf8'});
// Prints the content of file
console.log(data);
// This section calculates the sum of numbers from 1 to 10
let sum = 0;
for(let i=1; i<=10; i++){
    sum = sum + i;
}
// Prints the sum
console.log('Sum: ', sum);

Run the index.js file using the following command:

node index.js

Output:

This is from text file.
Sum:  55

Non-Blocking: It refers to the program that does not block the execution of further operations. Non-Blocking methods are executed asynchronously. Asynchronously means that the program may not necessarily execute line by line. The program calls the function and moves to the next operation and does not wait for it to return.

Example: The following example uses the readFile() function to read files and demonstrate Non-Blocking in Node.js

const fs = require('fs');
const filepath = 'text.txt';
// Reads a file in an asynchronous and non-blocking way
fs.readFile(filepath, {encoding: 'utf8'}, (err, data) => {
    // Stop if the file could not be read
    if (err) throw err;
    // Prints the content of file
    console.log(data);
});
// This section calculates the sum of numbers from 1 to 10
let sum = 0;
for(let i=1; i<=10; i++){
    sum = sum + i;
}
// Prints the sum
console.log('Sum: ', sum);

Run the index.js file using the following command:

node index.js

Output:

Sum:  55
This is from text file.

In the non-blocking program, the sum actually prints before the content of the file. This is because the program does not wait for the readFile() function to finish and moves on to the next operation. When readFile() completes, its callback prints the content.

How does Nodejs handle this?

Node.js uses an event-driven, non-blocking I/O model. When an asynchronous I/O operation is called, Node.js hands it off, through libuv, either to the system kernel (for example, for network sockets) or to a thread pool, which frees up the event loop to handle other tasks. Once the operation is complete, its callback is queued and the event loop runs it. This allows Node.js to execute other code while waiting for the operation to complete.

On the other hand, a truly blocking call such as readFileSync() runs on the main thread, and nothing else, including the event loop, makes progress until it returns. That is why synchronous APIs are best kept for startup code and small scripts rather than request handlers.

Node.js also has a thread pool that is used by the asynchronous functions of some built-in modules, like crypto, zlib, and fs. These operations are executed outside of the main event loop to prevent blocking the execution of other code.

In summary, Node.js handles slow operations by delegating them to the system kernel or the thread pool and passing the result to a callback function once they finish, while synchronous calls block the main thread until they return.
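A small experiment makes it visible what synchronous work does to the event loop. The sketch below schedules a timer for "as soon as possible" and then keeps the main thread busy for 200 ms; blockFor() is a hypothetical helper that stands in for any slow synchronous call.

```javascript
// Busy-waits for the given number of milliseconds, blocking the main thread.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Intentionally empty: this simulates slow synchronous work.
  }
}

const scheduledAt = Date.now();
let timerDelay = null;

// Asked to run as soon as possible...
setTimeout(() => {
  timerDelay = Date.now() - scheduledAt;
  console.log(`Timer fired after ${timerDelay} ms`);
}, 0);

// ...but it cannot run until this synchronous call returns.
blockFor(200);
console.log('Synchronous work finished');
```

The timer callback only gets its turn after the synchronous work is done, so the reported delay is at least 200 ms even though 0 ms was requested.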

A step-by-step explanation.

Thanks to Andrew Mead for explaining this in his course.

1. Push main() onto the call stack.
2. Push console.log() onto the call stack. It runs right away and gets popped.
3. Push setTimeout(2000) onto the stack. setTimeout() is a Node API. When we call it, we register an event-callback pair: the event is the 2000 millisecond wait, and the callback is the function to run once it is over.
4. After registering it in the APIs, setTimeout(2000) gets popped from the call stack.
5. Now the second call, setTimeout(0), gets registered in the same way. We now have two Node APIs waiting to execute.
6. After waiting for 0 milliseconds, the callback of setTimeout(0) gets moved to the callback queue. The same thing happens with the callback of setTimeout(2000) once its 2000 milliseconds are up.
7. In the callback queue, the functions wait for the call stack to be empty, because only one function can run at a time. This is taken care of by the event loop.
8. The last console.log() runs and main() gets popped from the call stack.
9. The event loop sees that the call stack is empty and the callback queue is not. So it moves the callbacks (in first-in-first-out order) to the call stack for execution.
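The walkthrough above does not show the program it describes, so here is a reconstruction that matches the steps. Treat it as an assumption rather than the exact code from the course: the messages are made up, main() stands in for the script's top-level code, and the log() helper only exists to keep a record of the order.

```javascript
const printed = [];

// Small helper: remember each message and print it.
function log(message) {
  printed.push(message);
  console.log(message);
}

function main() {
  log('Starting');

  // Registered first, but its callback is queued only after 2000 ms.
  setTimeout(() => log('2 second timer'), 2000);

  // Registered second; its callback is queued almost immediately,
  // yet it still has to wait for the call stack to be empty.
  setTimeout(() => log('0 second timer'), 0);

  log('Stopping');
}

main();
// Starting
// Stopping
// 0 second timer
// 2 second timer
```

Even with a delay of 0, the timer callback prints after 'Stopping', because the event loop only picks callbacks from the queue once main() has been popped from the call stack.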

How is Nodejs different from normal Javascript?

JavaScript (JS) and Node.js are both based on the same programming language, but they have some key differences:

Environment: JS runs in a browser, while Node.js runs in a server-side environment.
Modules: JS in the browser historically relied on plain script tags and now supports ES modules, whereas Node.js supports both CommonJS and ES modules and comes with the npm ecosystem, which makes it easy to write modular and reusable code.
APIs: JS has a set of APIs for manipulating the Document Object Model (DOM), while Node.js has APIs for file system operations, networking, and more.
Global objects: JS has a set of global objects that are available in the browser environment, such as "window" and "document". Node.js has a different set of global objects that are available in the server-side environment, such as "process" and "console".
Threading: JS is single-threaded, which means it can only execute one task at a time. Node.js is also single-threaded, but its sophisticated architecture helps it to handle more concurrent connections.
Performance: JS is designed to run in a browser environment, so it is optimized for handling user interactions and rendering web pages. Node.js is optimized for handling I/O operations and network requests, so it is faster and more efficient for server-side tasks.

These points may also lead to the question, "what's the difference between CommonJS and ES6?"

Let us have a look at CommonJS vs ES modules in depth:

*Basis*	*CommonJS*	*ES Module*
Functionality	The original Node.js module system; needs a bundler to run in browsers	The standard JavaScript module system; works natively in browsers and in Node.js
Loading	Modules are loaded synchronously at run time	Modules are parsed statically and can be loaded asynchronously
Dependencies	require() can be called anywhere, even conditionally	Static import declarations sit at the top level; import() loads a module dynamically
Type-checking	No type-checking capabilities	No type-checking capabilities either; tools such as TypeScript add it for both
File Structure	.js files by default, or .cjs files	.mjs files, or .js files in a package with "type": "module"
Export	module.exports or exports	export and export default
Import	require()	import

NOTE: In JavaScript, a module is a self-contained unit of code that defines a set of related functions, objects, and data. A module system is a way to organize and manage these modules in a codebase. CommonJS, ES modules (introduced in ES6), and AMD are all module systems.
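
As a rough illustration of the two syntaxes, here is a small module written in CommonJS, with the ES module equivalent shown in comments. The file name math.js and the functions add and extensionOf are made up for this example.

```javascript
// math.js (hypothetical), written as a CommonJS module.
const path = require("path"); // CommonJS import of a built-in module

function add(a, b) {
  return a + b;
}

function extensionOf(fileName) {
  return path.extname(fileName);
}

// CommonJS export: assign to module.exports.
module.exports = { add, extensionOf };

// The same module as an ES module (math.mjs) would read:
//   import path from "path";
//   export function add(a, b) { return a + b; }
//   export function extensionOf(fileName) { return path.extname(fileName); }
// and a consumer would write:
//   import { add } from "./math.mjs";
```

A CommonJS consumer would load it with `const { add } = require("./math");`.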

Why is Node.js Scalable?

Node.js is scalable because of the non-blocking I/O model (mentioned above), which allows it to handle multiple concurrent connections efficiently. In traditional thread-per-connection server models, each incoming connection ties up a thread that sits blocked while it waits for I/O, leading to degraded performance under high traffic. Node.js's event-driven, single-threaded architecture instead lets it keep a large number of requests in flight at once without getting bogged down.

When a request comes in, Node.js places it in a queue and continues processing other requests. When the I/O operation is complete, Node.js notifies the event loop, which then executes the corresponding callback function. This approach allows Node.js to handle multiple requests at the same time without blocking the main thread, making it a highly scalable platform for building networked applications.

Some drawbacks.

While Node.js has many advantages, it also has some disadvantages that developers should be aware of.

Not ideal for CPU-intensive applications: Node.js is designed to handle I/O-bound tasks, making it less efficient for applications that require a lot of CPU processing power. These types of applications could cause the event loop to block, slowing down the entire server.
Asynchronous Programming is difficult to understand for new developers.
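Harder to scale on a single machine: While Node.js can handle a large number of concurrent connections, a single Node.js process runs JavaScript on one thread, so an extremely high-traffic application does not get faster just by moving to a machine with more CPU cores. Scaling vertically means running several processes, for example with the built-in cluster module.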

This was mostly a theoretical blog on Node.js. The main aim was to cram as much info as possible into a single place to serve as a reference before interview prep. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

REST and REST APIs - Part 3

Saptarshi Bhattacharya — Mon, 06 Mar 2023 08:54:25 GMT

You will need to go through the 1st and 2nd blogs to get what's happening here.

Endpoint: https://hob-api.vercel.app/

GitHub repository: https://github.com/sbk2k1/API-Blog

/get Routes

GET HTTP requests are used to retrieve or fetch data from a server, typically when a user opens a webpage or asks for specific data. GET has a few constraints. Any data has to travel in the URL, and URL length is limited in practice, which caps how much can be sent. GET requests are also visible in browser histories and can be cached, which can impact security, so they are not suitable for sending sensitive information such as passwords or other authentication credentials in the URL. A GET request implemented using REST does not have a request body.

Create a new request in Postman or your browser as shown below:

Fire it off!

If you get a Status 200 OK response, you'll probably get a response body like this:

{
    "error": false,
    "message": "Get Request Received Successfully",
    "query_parameters": {},
    "bearer_token": "Bearer Token is absent",
    "headers": {
        "host": "hob-api.vercel.app",
        "x-real-ip": "xx.xxx.xxx.xxx",
        "x-vercel-proxy-signature-ts": "xxxxxxxxx8",
        "x-vercel-deployment-url": "hob-o362hip79-sbk2k1.vercel.app",
        "x-vercel-ip-latitude": "xx.xxx",
        "x-vercel-forwarded-for": "xx.xxx.xxx.xxx",
        "x-vercel-id": "bom1::xv4kk-1678028068902-ec68b76f258e",
        "forwarded": "for=xx.2xx.xx5.xxx;host=hob-api.vercel.app;proto=https;sig=0QmVhcmVyIDVhMDRiOGY1MmNkNjE0YTE4Zjc4Yjc3Y2Q1YWU5NmYzYWQ0MzVjMWNXXXXXXXXXXXXXjc0MTcwOGU=;exp=1678028368",
        "postman-token": "xxxfxxxx-5e47-xxxa-a51a-xxxxxx43a60e",
        "x-vercel-ip-longitude": "x8.xxx2",
        "accept": "*/*",
        "x-forwarded-for": "xx.xxx.xxx.xxx",
        "x-forwarded-host": "hob-api.vercel.app",
        "x-vercel-ip-country": "IN",
        "x-forwarded-proto": "https",
        "x-vercel-proxy-signature": "Bearer 5a04b8f52xxxxxxxxxxxxxxxxxxxx6f3ad435cxxxxxxxxxdbecd0d01eaff741708e",
        "accept-encoding": "gzip, deflate, br",
        "user-agent": "PostmanRuntime/x.xx.4",
        "x-vercel-ip-country-region": "WB",
        "x-vercel-ip-city": "Kolkata",
        "x-vercel-ip-timezone": "Asia/Kolkata",
        "x-vercel-proxied-for": "xx.xxx.xxx.1x4",
        "connection": "close"
    }
}

Here's a breakdown of each part of the response:

"error": false: This indicates that there was no error in processing the request.
"message": "Get Request Received Successfully": This is a custom message returned by the server to indicate that the GET request was received successfully.
"query_parameters": {}: This is an empty object indicating that there were no query parameters included in the GET request.
"bearer_token": "Bearer Token is absent": This indicates that no bearer token was included in the request.
"headers": { ... }: This is an object containing information about the headers included in the GET request. Each key-value pair in this object represents a single header and its value.

Some of the headers in this response are:

"host": "hob-api.vercel.app": This is the hostname of the server that received the request.
"x-real-ip": "xx.xxx.xxx.xxx": This is the IP address of the client that made the request.
"x-vercel-deployment-url": "hob-o362hip79-sbk2k1.vercel.app": This is the URL of the Vercel deployment that is handling the request.
"x-forwarded-proto": "https": This indicates that the request was made over HTTPS.
"user-agent": "PostmanRuntime/7.28.4": This is the user agent string of the client that made the request.
"connection": "close": This indicates that the connection will be closed after the response is sent.

Setting path and query parameters and bearer token

Let us configure the URL as such: (You can also add query params from the params tab)

https://hob-api.vercel.app/get/2?id=2&isDarkMode=true

Go to the "Headers" tab below the URL, add a new field called Authorization, and give it any value. I'm setting "This is my Authorization Token".

This is what the request looks like:

Firing it off returns this response:

{
    "error": false,
    "message": "Get Request Received Successfully",
    "path_parameter": "2 is the path parameter",
    "query_parameters": {
        "id": "2",
        "isDarkMode": "true"
    },
    "bearer_token": "This is my Authorization Token",
    "headers": {
        "host": "hob-api.vercel.app",
        "authorization": "This is my Authorization Token",
        "x-real-ip": "45.250.245.184",
        "x-vercel-proxy-signature-ts": "1678028816",
        "x-vercel-deployment-url": "hob-o362hip79-sbk2k1.vercel.app",
        "x-vercel-ip-latitude": "22.518",
        "x-vercel-forwarded-for": "45.250.245.184",
        "x-vercel-id": "bom1::xq858-1678028516261-e4cf9d6423ef",
        "forwarded": "for=45.250.245.184;host=hob-api.vercel.app;proto=https;sig=0QmVhcmVyIDdmN2Y1MmI3NGNhMDRjNDUxYmQ1MDA5YzljOWNiNWMwMDIzMjJiZmZlYjc0MDZkMGE2MThiYjNkYmViNzQyY2M=;exp=1678028816",
        "postman-token": "2c80ebc4-74cd-44c3-af96-72f27b0c67c5",
        "x-vercel-ip-longitude": "88.3832",
        "accept": "*/*",
        "x-forwarded-for": "45.250.245.184",
        "x-forwarded-host": "hob-api.vercel.app",
        "x-vercel-ip-country": "IN",
        "x-vercel-ip-country-region": "WB",
        "x-vercel-proxy-signature": "Bearer 7f7f52b74ca04c451bd5009c9c9cb5c002322bffeb7406d0a618bb3dbeb742cc",
        "accept-encoding": "gzip, deflate, br",
        "user-agent": "PostmanRuntime/7.28.4",
        "x-forwarded-proto": "https",
        "x-vercel-ip-city": "Kolkata",
        "x-vercel-ip-timezone": "Asia/Kolkata",
        "x-vercel-proxied-for": "45.250.245.184",
        "connection": "close"
    }
}

Describing the fields and what they are used for:

Query parameters: These are extra data passed in the URL of an HTTP request to filter or modify the response. They are commonly used to paginate, sort, or filter data, and are visible in the URL bar of the browser. They are appended to the endpoint after a "?", with individual parameters separated by "&".
Path parameters: These are parts of the URL path that represent a dynamic value. They are often used to identify a specific resource and make URLs more meaningful and memorable. A path parameter is set by adding a value after the endpoint route, separated by a "/".
Authorization header: This is a request header that carries a token or credentials proving the identity of the requester. It is commonly used to secure APIs and web applications so that only authorized users can access protected resources. The value may be an encoded or signed token that the server checks to decide whether a given client may access a route.
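
If you'd like to see how these pieces come apart in code, here is a small sketch that uses the built-in URL class on the same URL as above. Treating the second path segment as the path parameter is my assumption about how the /get route is laid out.

```javascript
// The same URL used above; the URL class is built into Node.js and browsers.
const url = new URL("https://hob-api.vercel.app/get/2?id=2&isDarkMode=true");

// Path parameter: splitting "/get/2" on "/" gives ["", "get", "2"].
const pathParameter = url.pathname.split("/")[2];

// Query parameters: everything after "?", separated by "&".
// Values always arrive as strings, even "2" and "true".
const queryParameters = Object.fromEntries(url.searchParams);

// The Authorization header travels separately from the URL.
const headers = { Authorization: "This is my Authorization Token" };

console.log(pathParameter);   // 2
console.log(queryParameters); // { id: '2', isDarkMode: 'true' }
```

This matches the response above, where both query parameter values come back as strings.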

NOTE: We will not be explaining path and query parameters or the authentication header/other headers anymore moving forward.

/post Routes

HTTP POST is a request method that is used to submit an entity to the specified resource, often causing a change in state or side effects on the server. The main purpose of the HTTP POST method is to create a new resource or to update an existing resource on the server.

The constraints for HTTP POST requests are:

The request payload must contain the entity that will be created or updated on the server.
The POST method may cause side effects, such as the creation of a new resource or the modification of an existing one.
Unlike GET responses, responses to POST requests are not cached by default.
POST requests typically change data on the server, so they require proper authentication and authorization.

Create a new request in Postman or your browser as shown below:

Go to the Body tab, select "raw" and then "JSON" from the drop-down. Enter any random data or paste the following body:

{
    "field_1": 1,
    "name": "sbk2k1",
    "github": "github.com/sbk2k1"
}

Fire it off! The response looks like this if everything was 200 OK.

{
    "error": false,
    "message": "Post Request Received Successfully",
    "body": {
        "field_1": 1,
        "name": "sbk2k1",
        "github": "github.com/sbk2k1"
    },
    "query_parameters": {},
    "bearer_token": "Bearer Token is absent",
    "headers": {
        ...
    }
}

In a POST request, the body of the request contains the data that is being sent to the server. This data can be in various formats such as JSON, XML, or plain text. The purpose of the body is to provide additional information to the server beyond the URL and headers, which are used to route the request and provide metadata about the request, respectively. The body is particularly useful for sending data to the server for processing, such as creating or updating a resource, submitting a form, or uploading a file. You can see the body you sent through the request. In an actual working backend, the body can be saved to an external database.
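
For anyone who prefers code to Postman, the same request can be sketched with fetch(). This is only a sketch: the /post URL is my assumption based on the route name, and buildPostRequest is a made-up helper, not part of the API.

```javascript
// Builds the arguments for fetch(); nothing is sent until fetch() is called.
function buildPostRequest(body) {
  return {
    url: "https://hob-api.vercel.app/post", // assumed route
    options: {
      method: "POST",
      // Tells the server how to interpret the body.
      headers: { "Content-Type": "application/json" },
      // The body is sent as text, so the object is serialised to JSON.
      body: JSON.stringify(body),
    },
  };
}

const request = buildPostRequest({
  field_1: 1,
  name: "sbk2k1",
  github: "github.com/sbk2k1",
});

// To actually send it (Node.js 18+ or a browser):
//   fetch(request.url, request.options)
//     .then((response) => response.json())
//     .then((data) => console.log(data.body));
console.log(request.options.body);
```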

/put Routes

A PUT request is an HTTP method that is used to update an existing resource on the server. It is similar to the POST request, but it is used to modify an existing resource rather than create a new one.

The constraints of a PUT request are similar to that of a POST request. It requires that the client has the proper authorization to modify the resource, and it is idempotent, meaning that making multiple identical requests will have the same effect as a single request.

The body of a PUT request typically contains the updated representation of the resource being modified. This means that the client must send the entire representation of the resource, including any fields that have not changed.

One of the primary constraints of a PUT request is that it replaces the entire resource at the given URL. If the client only wants to update a specific field or subset of fields, a PATCH request should be used instead. Additionally, if the client is unsure if the resource already exists on the server, a POST request should be used instead of a PUT request.

NOTE: The PATCH method is not inherently dangerous, but it can be risky if not used carefully. The main reason is that PATCH requests are designed to make partial updates to an existing resource, rather than replacing it entirely. If the PATCH request is not crafted correctly, it could potentially overwrite important data, leading to unintended consequences or security vulnerabilities. Additionally, since PATCH is a less commonly used HTTP method, some systems may not support it or may handle it differently, which could lead to compatibility issues. This blog does not include PATCH functionality.
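
The difference between replacing and partially updating is easier to see on a plain object. The sketch below is not how the API in this blog works; put and patch are hypothetical helpers, and patch here is a simple merge of top-level fields.

```javascript
// Hypothetical stored resource; not part of the hob-api project.
const stored = { field_1: 1, name: "sbk2k1", github: "github.com/sbk2k1" };

// PUT semantics: the request body becomes the whole resource.
function put(resource, body) {
  return { ...body };
}

// PATCH semantics (simple merge): only the fields in the body change.
function patch(resource, body) {
  return { ...resource, ...body };
}

const afterPut = put(stored, { name: "new-name" });
const afterPatch = patch(stored, { name: "new-name" });

console.log(afterPut);   // { name: 'new-name' }
console.log(afterPatch); // { field_1: 1, name: 'new-name', github: 'github.com/sbk2k1' }
```

After the PUT, field_1 and github are gone because they were not in the request body, which is why a PUT has to send the entire representation.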

Change the request in Postman to a /put route and fire it off. It should look like this. We are using the same data as the post request.

The 200 OK response should look something like this.

{
    "error": false,
    "message": "Put Request Received Successfully",
    "body": {
        "field_1": 1,
        "name": "sbk2k1",
        "github": "github.com/sbk2k1"
    },
    "query_parameters": {},
    "bearer_token": "Bearer Token is absent",
    "headers": {
        ...
    }
}

/delete Routes

The DELETE HTTP method is used to delete a resource identified by a URI (Uniform Resource Identifier). It is used to remove a resource from the server. The DELETE method is idempotent, which means that making the same request multiple times will produce the same result as making the request only once.

In RESTful API design, the DELETE method typically does not have a body because the resource to be deleted is identified by the URI. The server deletes the resource identified by the URI, and the response status code indicates the success or failure of the operation.

To specify the resource to be deleted, the client includes the resource's identifier in the URI. For example, a DELETE request to https://example.com/api/users/123 would delete the user with the ID of 123. If the resource to be deleted cannot be found, the server returns a 404 Not Found status code. (Status code refresher here)
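
Here is a minimal sketch of that behaviour with an in-memory store. The users map and deleteUser function are made up for illustration; a real backend would talk to a database.

```javascript
// Hypothetical in-memory "database" of users keyed by ID.
const users = new Map([["123", { name: "sbk2k1" }]]);

// Returns the status code a server might send for DELETE /api/users/:id.
function deleteUser(id) {
  if (!users.has(id)) {
    return 404; // nothing to delete
  }
  users.delete(id);
  return 200; // deleted
}

const first = deleteUser("123");
const second = deleteUser("123");
console.log(first, second, users.size); // 200 404 0
```

The status code differs between the two calls, but the state on the server is the same after each one (user 123 does not exist), and that is what idempotent refers to.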

Load up the request as shown below.

The 200 OK response is as follows.

{
    "error": false,
    "message": "Delete Request Received Successfully",
    "query_parameters": {},
    "bearer_token": "Bearer Token is absent",
    "headers": {
        ...
    }
}

NOTE: You can add a bearer token, query, or path parameters to any of the requests. I've simply skipped these for the brevity of this blog.

In conclusion, the four HTTP methods, GET, POST, PUT, and DELETE, serve different purposes in REST API design. GET is used to retrieve resources, POST to create new resources, PUT to update or replace existing resources, and DELETE to remove resources. Each method has its constraints and use cases that are important to consider when designing a RESTful API. Understanding the differences between these methods is crucial for creating effective and efficient APIs.

You may have understood the different technicalities, but this blog still does not highlight how these APIs are used in an actual application. Stay tuned for another series in which we will go through building an entire app and will show all implementations, code structure, and security/authentication measures commonly used. This should have provided you with a clear view of what REST APIs and REST are. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

REST and REST APIs - Part 2

Saptarshi Bhattacharya — Sun, 05 Mar 2023 15:09:23 GMT

In this blog, we will set up a REST API implemented using HTTP to play around with and understand.

Resources

API endpoint: https://hob-api.vercel.app/
GitHub Repository: https://github.com/sbk2k1/API-Blog

Prerequisites

We will need some tools to get through this blog. They are:

Git/GitHub and npm/yarn (necessary only if the deployed endpoint is down)
Postman (We can use cURL but Postman will provide us with a cleaner UI)

Installation

Git/GitHub (optional)- Check out my blog!
npm and yarn (optional)- To install npm, follow the below steps:
1. Download the Node.js installer from here.
2. Run the installer and follow the installation steps.
3. Once installed, open a command prompt or terminal and type npm -v to check the version of npm.

To install yarn, follow the below steps:

Download the Yarn installer from here.
Run the installer and follow the installation steps.
Once installed, open a command prompt or terminal and type yarn -v to check the version of Yarn.

Note: Yarn can also be installed using npm by running the command npm install -g yarn in the command prompt or terminal.

Postman: To install Postman, you can follow these steps:
1. Go to the Postman website.
2. Click the "Download" button for the version of Postman you want to install.
3. Follow the on-screen instructions to download the installation file.
4. Once the file is downloaded, open it and follow the instructions to install Postman.
5. Once the installation is complete, you can launch Postman by opening the application from your computer's application menu or by double-clicking on the Postman icon on your desktop (if you chose to create one during installation).

That's it! You're now ready to use Postman to test and explore APIs.

Setting up a local server (optional)

Follow these steps to start up a server on your local system:

Clone the repository:
```
 git clone https://github.com/sbk2k1/API-Blog.git
```
Navigate to the directory:
```
 cd API-Blog
```
Install dependencies using either npm or yarn:
```
 npm install
```
or
```
 yarn install
```
Remove the two forward slashes (//) from the beginning of the app.listen line to uncomment it.
```
 app.listen(3000, () => {
   console.log(`Server running on port 3000`);
 });
```
Add two forward slashes (//) at the beginning of the module.exports = app line to comment it out.
```
 // module.exports = app;
```
Once the dependencies are installed, start the server:
```
 npm start
```
or
```
 yarn start
```
This will start the server on port 3000 by default.

Getting it to work

Note: I'm going to use the deployed endpoint for the blog, but you can use http://localhost:3000 if you have it set up on your local system.

/ Route

Create a new request in Postman as shown below:

Fire it off and see what happens.

This is the response.

This signifies that the server is working fine. Note the status in the top right that says 200 OK. (Status code refresher here)

You can also paste the link into your browser and see what happens. By default, when you enter a URL into a web browser and press enter, it sends a GET request to the server to retrieve the content of the specified resource.

We will look into each route and each little element in greater detail in the next blog. Stay tuned for part 3. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

REST and REST APIs - Part 1

Saptarshi Bhattacharya — Sat, 04 Mar 2023 17:36:37 GMT

This blog will talk about REST and RESTful APIs and will lead into the Backend Dev Blog.

So what is REST?

REST is an acronym for REpresentational State Transfer. It is a software architectural style or design pattern. An architectural pattern is a general, reusable solution to a commonly occurring problem in software architecture within a given context. Architectural patterns are often documented as software design patterns. REST was originally designed as a Web Architecture, and its principles were presented by Roy Fielding, a computer scientist, in his Ph.D. dissertation in 2000. REST-compliant systems, often called RESTful systems, are characterized by being stateless and by separating the concerns of the client and server.

Some other Architectural Styles.

There are many recognized architectural styles and design patterns, among them:

Blackboard
Client-server (2-tier, 3-tier, n-tier, cloud computing exhibit this style)
Component-based
Data-centric
Event-driven (or implicit invocation)
Layered (or multilayered architecture)
Microservices architecture
Monolithic application
Peer-to-peer (P2P)
Pipes and filters
Plug-ins
Reactive architecture
Representational state transfer (REST)
Rule-based
Service-oriented
Shared nothing architecture
Space-based architecture

Disclaimer: I don't know about all of these. They are listed just to show where REST sits and what some of its alternatives are.

Principles of REST

Uniform Interface: The following four constraints can achieve a uniform REST interface:
- Identification of resources: The interface must uniquely identify each resource involved in the interaction between the client and the server.
- Manipulation of resources through representations: The resources should have uniform representations in the server response.
- Self-descriptive messages: Each resource representation should carry enough information to describe how to process the message. It must clearly convey information on the operations that can be done on a resource.
- Hypermedia as the engine of application state: The client should have only the initial URI of the application. The client application should dynamically drive all other resources and interactions with the use of hyperlinks.
Client and Server: The client is the entity that makes the demands and the server is the entity that handles the demands. The producer and consumer need to be separate for independent evolution. In the REST architectural style, the implementation of the client and the implementation of the server can be done independently without each knowing about the other. As long as each side knows what format of messages to send to the other, they can be kept modular and separate. Separating the user interface concerns from the data storage concerns, we improve the flexibility of the interface across platforms and improve scalability by simplifying the server components.
Statelessness: Systems that follow the REST paradigm are stateless, meaning that the server does not need to know anything about what state the client is in and vice versa. In this way, both the server and the client can understand any message received, even without seeing previous messages. This constraint of statelessness is enforced through the use of resources, rather than commands. Resources are the nouns of the Web - they describe any object, document, or thing that you may need to store or send to other services. This also means all the information needed to carry out a request is present in the request itself.
Cacheable: The cacheable constraint requires that a response should implicitly or explicitly label itself as cacheable or non-cacheable. If the response is cacheable, the client application gets the right to reuse the response data later for equivalent requests and a specified period.
Layered System: The layered system style allows an architecture to be composed of hierarchical layers by constraining component behavior. For example, in a layered system, each component cannot see beyond the immediate layer they are interacting with.
Code on Demand: This constraint is optional; an API can be RESTful even without providing code on demand. The client can request code from the server, and the response will then contain some code, usually in the form of a script when the response is in HTML format. The client can then execute that code.
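
Of these, "hypermedia as the engine of application state" is usually the hardest one to picture, so here is a rough sketch. The response shape, the "links" field, and findLink are all my assumptions; REST itself does not mandate any particular link format.

```javascript
// A hypothetical response for GET /users/123.
const userRepresentation = {
  id: 123,
  name: "sbk2k1",
  links: [
    { rel: "self", href: "/users/123" },
    { rel: "articles", href: "/users/123/articles" },
  ],
};

// The client finds where to go next by relation name
// instead of hard-coding URLs.
function findLink(representation, rel) {
  const link = representation.links.find((candidate) => candidate.rel === rel);
  return link ? link.href : undefined;
}

console.log(findLink(userRepresentation, "articles")); // /users/123/articles
```

The client only needs to know the starting URL and the relation names; every other URL is discovered from the responses.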

What is a Resource?

In REST, a resource is an object or piece of data that can be identified and manipulated using a unique identifier or URL.

What are Resource Methods?

Resource methods are used to perform the desired transition between two states of any resource.

NOTE: A large number of people wrongly relate resource methods to HTTP methods (i.e., GET/PUT/POST/DELETE). Roy Fielding never recommended which method should be used in which condition. All he emphasizes is that the interface should be uniform.

REST and HTTP are Not the Same

Many people use HTTP and REST interchangeably, but they are not the same. HTTP is the underlying protocol used for communication between clients and servers on the World Wide Web. REST, on the other hand, is an architectural style that provides a set of guidelines and constraints for designing web services that are scalable, reliable, and easy to maintain. Roy Fielding's dissertation defines REST as a set of constraints rather than a concrete implementation, so REST is not tied to HTTP, even though HTTP is the protocol most commonly used to implement it.

REST APIs

This section is a lead-in to the Backend Development blog and will contain everything you need to know to get started with APIs.

What is an API?

An API (Application Programming Interface) is a set of protocols, tools, and standards for building software applications. It defines how different software components should interact with each other, allowing developers to create software applications that can communicate and exchange data with other systems. APIs can be used to access data, services, or functionality provided by other software applications, and they are essential for building modern web and mobile applications. You can think of a web API as a gateway between clients and resources on the web.

For example, let's consider a restaurant. The customers who come in and place orders are the consumers; in this case, they are the client. The producers are the chef and the kitchen crew, known here as the server (backend). The waiter connects these two and ensures the smooth running of the restaurant (the software/app). Thus the waiter here is the API.

What are RESTful APIs?

A RESTful API is a type of web API that follows the principles of the REST architectural style. It is designed to provide a standard way of accessing and manipulating resource states over the web using various methods.

Why RESTful APIs

The benefits of using RESTful APIs include:

Scalability: RESTful APIs are designed to be scalable and can handle large volumes of requests and responses.
Flexibility: RESTful APIs can support different types of clients (e.g., web browsers, mobile apps, IoT devices) and can be used with different programming languages and frameworks.
Modularity: RESTful APIs are typically organized around resources, making them modular and easier to maintain.
Caching: RESTful APIs support caching of responses, which can improve performance and reduce server load.
Security: RESTful APIs can be secured using standard security protocols such as HTTPS and OAuth, making them more secure and less vulnerable to attacks.
Ease of Use: RESTful APIs are easy to use and understand, which makes them more accessible to developers of all skill levels.

Overall, RESTful APIs provide a standardized and scalable way to build modern web and mobile applications that can communicate and exchange data with other systems.

The Way of REST

The basic function of a RESTful API is the same as browsing the internet. The client contacts the server by using the API when it requires a resource. API developers explain how the client should use the REST API in the server application API documentation. These are the general steps for any REST API call:

The client sends a request to the server. The client follows the API documentation to format the request in a way that the server understands.
The server authenticates the client and confirms that the client has the right to make that request.
The server receives the request and processes it internally.
The server returns a response to the client. The response contains information that tells the client whether the request was successful. The response also includes any information that the client requested.
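
The four steps above can be sketched as a single function. Everything here (the token, the resources map, handleRequest) is hypothetical and only meant to show the order of the checks, not a real server.

```javascript
// Hypothetical data; a real server would use a database or an auth service.
const VALID_TOKEN = "secret-token";
const resources = new Map([["/articles/1", { title: "REST and REST APIs" }]]);

function handleRequest(request) {
  // Step 2: authenticate the client.
  if (request.headers.authorization !== `Bearer ${VALID_TOKEN}`) {
    return { status: 401, body: { error: true, message: "Unauthorized" } };
  }
  // Step 3: process the request internally.
  const resource = resources.get(request.path);
  if (resource === undefined) {
    return { status: 404, body: { error: true, message: "Not Found" } };
  }
  // Step 4: respond with the outcome and the requested data.
  return { status: 200, body: { error: false, data: resource } };
}

// Step 1: the client sends a request in the format the server expects.
const response = handleRequest({
  method: "GET",
  path: "/articles/1",
  headers: { authorization: "Bearer secret-token" },
});
console.log(response.status); // 200
```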

Client Request

What does the RESTful API client request contain? The components are:

Unique resource identifier: The server identifies each resource with unique resource identifiers. For REST services, the server typically performs resource identification by using a Uniform Resource Locator (URL). The URL specifies the path to the resource and is also called the endpoint.
Methods: Developers often implement RESTful APIs by using the Hypertext Transfer Protocol (HTTP). An HTTP method tells the server what it needs to do to the resource. The following are four common HTTP methods:
1. GET: Clients use GET to access resources that are located at the specified URL on the server. They can cache GET requests and send parameters in the RESTful API request to instruct the server to filter data before sending.
2. POST: Clients use POST to send data to the server. They include the data representation with the request. Sending the same POST request multiple times has the side effect of creating the same resource multiple times.
3. PUT: Clients use PUT to update existing resources on the server. Unlike POST, sending the same PUT request multiple times in a RESTful web service gives the same result.
4. DELETE: Clients use the DELETE request to remove the resource. A DELETE request can change the server state. However, if the user does not have appropriate authentication, the request fails.
HTTP headers: Request headers are the metadata exchanged between the client and server. For instance, the request header indicates the format of the request and response, provides information about the request status, and so on. It also has fields for authentication.
Parameters: RESTful API requests can include parameters that give the server more details about what needs to be done. The following are some different types of parameters:
- Path parameters that specify URL details.
- Query parameters that request more information about the resource.
- Cookie parameters that authenticate clients quickly.
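
The point about repeating POST versus repeating PUT is easier to see with a toy example. The in-memory articles map and the post and put helpers below are made up for illustration.

```javascript
// Hypothetical in-memory collection standing in for the server's storage.
const articles = new Map();
let nextId = 1;

// POST /articles: the server picks a new ID on every call.
function post(body) {
  const id = String(nextId++);
  articles.set(id, body);
  return id;
}

// PUT /articles/:id: the client names the resource, so repeats overwrite it.
function put(id, body) {
  articles.set(id, body);
  return id;
}

post({ title: "Hello" });
post({ title: "Hello" });       // same request again: a second article appears
console.log(articles.size);     // 2

put("1", { title: "Updated" });
put("1", { title: "Updated" }); // same request again: nothing new is created
console.log(articles.size);     // 2
```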

Server Response

What does the RESTful server response contain? The components are:

Status line: The status line contains a three-digit status code that communicates request success or failure. For instance, 2XX codes indicate success, but 4XX and 5XX codes indicate errors. 3XX codes indicate URL redirection.
The following are some common status codes:
- 200: Generic success response
- 201: Created, the usual success response when a POST creates a resource
- 400: Incorrect request that the server cannot process
- 404: Resource not found
Message body: The response body contains the resource representation. The server selects an appropriate representation format based on what the request headers contain. Clients can request information in XML or JSON formats, which define how the data is written in plain text.
Headers: The response also contains headers or metadata about the response. They give more context about the response and include information such as the server, encoding, date, and content type.
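The three parts of a response described above can be seen by splitting a raw HTTP response by hand. This is only a sketch: the response text below is invented sample data, and real clients do this parsing for you.

```python
import json
from http import HTTPStatus

# Invented sample response, written the way it travels over the wire.
RAW_RESPONSE = (
    "HTTP/1.1 201 Created\r\n"
    "Content-Type: application/json\r\n"
    "Server: example-server\r\n"
    "\r\n"
    '{"id": 7, "title": "Hello REST"}'
)


def parse_response(raw):
    """Split a raw response into status code, headers, and message body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body


def status_class(code):
    """Map a status code to the broad classes described in the text."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirection"
    if 400 <= code < 600:
        return "error"
    return "other"


code, headers, body = parse_response(RAW_RESPONSE)
resource = json.loads(body)  # the Content-Type header tells us the body is JSON

print(code, HTTPStatus(code).phrase, status_class(code))
```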

Common Response Status Codes

We will look at REST APIs in more depth in the next blog, REST and REST APIs - Part 2. This post should have given you a clear view of what REST and REST APIs are. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

How to contribute using Git and GitHub

Saptarshi Bhattacharya — Thu, 16 Feb 2023 15:32:43 GMT

This blog will be a step-by-step guide to my Git and GitHub workshop, for which all the resources can be found here. All code snippets will be provided in the different readme files. You can send PRs (even dummy ones), and create issues and I'll merge them. (probably)

What are Version Control Systems?

Version control systems are a category of software tools that record changes made to files by keeping track of every modification made to the code.

Why do we need Version Control Systems?

Software projects are undertaken by multiple developers with different areas of specialty. They may be present in different locations, working at different times and on different functionalities/features. A version control system helps the team communicate efficiently and track every change made to the source code, along with who made each change and what was changed.

Some of the benefits of Version Control Systems are:

Enhances the project development speed by coordinating efforts between developers.
Enhances communication and productivity.
Provides a robust system to track changes and point out errors.
Effective for promoting remote work.
A fatal error can be easily fixed by rolling back to a previous version.
Helps in recovery in case of any disaster or contingent situation.
Informs us about Who, What, When, and Why changes have been made.

What is Git?

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

It is an open-source project originally developed by Linus Torvalds in 2005 to support the development of the Linux Operating System Kernel.

It is used for tracking changes in any set of files, most often to coordinate work among developers who are building source code together. Its goals include speed, data integrity, and support for distributed, non-linear workflows.

What is GitHub?

GitHub is a Microsoft-owned company that provides Internet hosting services for software development and version control using Git. It makes it a lot easier for developers to work and collaborate on projects together.

GitHub provides a very user-friendly interface that introduces even beginners to version control. GitHub also works on promoting open-source development, community learning, and all other good stuff through different programs, but more on that later. (hmu if you want to know!)

Git vs GitHub

People get confused about the difference between Git and GitHub. What do they actually do? Which one to learn? (I was confused too)

So Git is the application installed on the local computer that lets you manage and track source code, while GitHub provides a lot more features! It not only provides repository hosting services but also lets you work with most of the Git features through an easy-to-use web UI.

So finally (keywords)

Git - On your local machine. Uses CLI. Track code and files. Helps create branches, unlike a lot of other VCSs. Made by Linus Torvalds. Open Source.
GitHub - Git repository hosting service. Cloud-based. Easy and cool UI. Can share with others. Visualize workflows. And a lot more!

Let's Git it!

Requirements

We'll need two things to start version controlling right away:

Git installed on our Local System: Head over to the Git Downloads Website. Download Git for your system (Windows/Mac). Click through the installation process and you're good.
Create a GitHub account: Head over to the GitHub Website and sign up to create your account!

Configuration

We need to configure Git for ourselves using the following commands in a Terminal or Command Prompt.

git config --global user.email "<your-email>@example.com"
git config --global user.name "<your-name>"

The Workflows

In this blog, I'll go through the entirety of two workflows.

Create and work on your own repository.
Contribute to an open-source repository.

We need to understand some terms before that.

Terminology

Open source - A software development philosophy that emphasizes transparency, collaboration, and community-driven innovation. Open-source projects make their source code publicly available for others to use, modify, and distribute freely.
Repository - A central location in Git where all the project's files and version history are stored. Developers can make changes to files and commit those changes to the repository.
Branch - A copy of the repository that allows developers to work on new features or bug fixes without affecting the main codebase. Once the changes are complete, they can be merged back into the main branch.
Pull request - A request to merge changes made in a branch into the main codebase. Other developers can review the changes and provide feedback before the changes are merged.
Fork - A copy of a repository that allows developers to make changes without affecting the original codebase. Forks are often used in open-source projects to contribute changes back to the original project.
Issue - A problem or task that needs to be addressed in a project. Issues can be used to track bugs, feature requests, or other tasks.
Commit - A snapshot of changes made to the codebase. Commits include a message describing the changes made and who made them.
Merge - The process of combining changes from one branch or repository into another. Merges can be used to integrate changes made in a fork back into the original project.
Clone - A copy of a repository that is stored on a local machine. Cloning a repository allows developers to work on the code without being connected to the internet.
Pull - The process of downloading changes made to a remote repository to a local machine. Pulling updates from a remote repository ensures that a local repository is up-to-date with the latest changes made by other developers.
Push - The process of uploading changes made from a local repository to a remote repository. Pushing changes allows other developers to see the changes and collaborate on the project.
.gitignore - A file in a repository that specifies which files or directories should be excluded from version control. Files or directories listed in the .gitignore file will not be tracked by Git.
License - A legal agreement that defines how an open-source project can be used and distributed. Open source licenses typically allow others to use and modify the code but may require attribution or impose other conditions.

Workflow #1: Personal Project

To convert a directory to a Git repository we use the git init command. Let's open up a terminal in the project directory and enter the command:

git init

Great! The directory is now a git repository and git is now tracking any changes done inside.

Let us then create two markdown files to store the commands we used until now.

Inside the terminal let us type the git status command to see which files are untracked and which have been added to the staging area.

git status

We can add these files to the staging area using the git add command. We can use . instead of a file name to signify that we want to add all the files in the repository to the staging area.

git add .

Let us type git status once again to check if the files are added to the staging area.

Great! The files are added to the staging area and are ready to be committed. We will commit the staged files using the git commit command

git commit -m "Commit Message"

Now the files are committed and git has stored a snapshot of the files. You can roll back to this commit if any errors pop up on further development.

We can now create a new repository on GitHub, by choosing all the required options.

After the empty GitHub repo is created, follow the second set of instructions, Push an existing repository

git remote add origin https://github.com/<your-username>/learn-git
git push -u origin master

Now your code is hosted publicly and visible to everyone!

Stages in Git

Let us now visualize what happens when you use the add and commit commands.
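One way to picture the stages is a toy model with three areas: the working directory, the staging area, and the list of commits. The `ToyRepo` class below is a hypothetical sketch for building intuition. It is not how Git stores data internally, and it simplifies what happens to the staging area after a commit.

```python
class ToyRepo:
    """Toy model of the three areas a file moves through (not real Git)."""

    def __init__(self):
        self.working_dir = {}  # files as they are on disk right now
        self.staging = {}      # changes picked for the next commit
        self.commits = []      # list of (message, snapshot) pairs

    def write(self, name, content):
        self.working_dir[name] = content

    def add(self, name):
        # Like `git add`: copy the current version into the staging area.
        self.staging[name] = self.working_dir[name]

    def commit(self, message):
        # Like `git commit`: store a snapshot built from what was staged.
        previous = self.commits[-1][1] if self.commits else {}
        self.commits.append((message, {**previous, **self.staging}))
        self.staging = {}

    def status(self):
        committed = self.commits[-1][1] if self.commits else {}
        not_staged = sorted(
            name
            for name, content in self.working_dir.items()
            if self.staging.get(name, committed.get(name)) != content
        )
        return {"staged": sorted(self.staging), "not staged": not_staged}


repo = ToyRepo()
repo.write("commands.md", "git init")
before_add = repo.status()
repo.add("commands.md")
after_add = repo.status()
repo.commit("Commit Message")
after_commit = repo.status()

print(before_add, after_add, after_commit, sep="\n")
```

The file starts out as "not staged", moves to "staged" after add, and both lists are empty once the commit has stored the snapshot.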

This was the entire create and host your personal project workflow. You now have an understanding of the entire process and can create your own repositories on GitHub. You can find a definition of commands used in this section down below.

Commands

git init: Initializes a new Git repository in the current working directory. This creates a new .git directory that contains all the necessary files and subdirectories for Git to track changes in your project.
git add: Adds changes to the staging area, which is a temporary storage area for changes before they are committed. You can use this command to add specific files or directories, or you can use git add . to add all changes in the current directory.
git status: Displays the current status of the working directory, including any changes that have been made but not yet staged, any changes that have been staged but not committed, and any untracked files. This command is useful for keeping track of what changes have been made and what still needs to be done.
git commit: Commits the changes in the staging area to the local Git repository. This creates a new commit with a unique identifier, a commit message, and a snapshot of the changes that were added to the staging area.
git remote add origin: Adds the remote repository that Git will use for pushing and pulling changes. This command registers the URL of the remote repository under the name "origin", which is the conventional name for the primary remote repository.
git push: Pushes the committed changes to the remote repository. This sends the changes to the remote repository and updates the branch that you are working on. You can specify the remote repository and branch, for example git push origin master.

Workflow #2: Contributing

This is the one you are mostly going to use when you are trying to contribute to an open-source repository. Contribution typically involves the following steps:

Fork the repository: Forking creates a copy of the original repository under your own account, which you can work on independently. To do this, navigate to the repository's GitHub page and click the "Fork" button in the upper right corner.
Clone the fork: Once you've forked the repository, you'll want to clone it to your local machine so you can make changes to it. To do this, run the following command in your terminal, replacing your-username with your GitHub username and repository-name with the name of the repository you forked:
```
 git clone git@github.com:your-username/repository-name.git
```
Create a new branch: It's generally a good idea to create a new branch for each set of changes you make. To do this, run the following command, replacing new-branch-name with a descriptive name for your new branch:
```
 git checkout -b new-branch-name
```
Visualization of branches in Git
Make changes: Now that you have the repository cloned and a new branch checked out, you can make changes to the code. Use your preferred text editor or IDE to edit the files.
Commit your changes: Once you've made the changes, you'll want to stage and commit them to your local repository. For details, refer to the personal workflow section of the blog.
```
 git add .
 git commit -m "commit-message"
```
Push the changes: After committing your changes, you'll want to push them to your forked repository on GitHub. To do this, run the following command, replacing new-branch-name with the name of the branch you created:
```
 git push -u origin new-branch-name
```
Create a pull request: Once you've pushed your changes to your forked repository, you can create a pull request to merge your changes into the original repository. To do this, navigate to the original repository's GitHub page and click the "New pull request" button. Select your fork and the branch you just pushed, and provide a description of your changes.
Respond to feedback: The maintainers of the original repository may request changes or ask questions about your pull request. Be sure to respond in a timely manner and make any requested changes.
Merge your changes: If the maintainers approve your pull request, they will merge your changes into the original repository. Congratulations, you've successfully contributed to an open-source project!

Note that some repositories may have slightly different workflows or conventions, so be sure to check the project's documentation or ask the maintainers if you're unsure.
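To see the whole sequence in one place, the sketch below assembles the terminal commands from the steps above for a given fork. The helper `contribution_commands` and the example values are hypothetical. It only builds the command strings and does not run Git, and the `git add .` line comes from the personal workflow section.

```python
def contribution_commands(username, repo, branch, message):
    """Return, in order, the commands for contributing through a fork."""
    return [
        f"git clone git@github.com:{username}/{repo}.git",  # clone your fork
        f"cd {repo}",
        f"git checkout -b {branch}",                         # new branch for the change
        "git add .",                                         # stage the edited files
        f'git commit -m "{message}"',                        # commit locally
        f"git push -u origin {branch}",                      # push the branch to your fork
    ]


commands = contribution_commands(
    username="your-username",
    repo="repository-name",
    branch="new-branch-name",
    message="commit-message",
)
for command in commands:
    print(command)
```

After the last command, the pull request itself is opened from the GitHub web page, as described in the steps.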

After the above process, your code is still not in the master branch yet, right? Teams usually require a minimum number of reviews before a pull request can be merged. A reviewer might ask for code changes, better documentation, or anything else. Once enough reviewers have approved your work, they can merge it! You can also send a PR to the master branch directly. Please refer to the documentation of the repository and check its contribution guidelines.

Commands

git checkout: This command allows you to switch between different branches or versions of your code. When you run git checkout followed by a branch name, Git will replace the contents of your working directory with the version of the code stored in that branch. This is useful when you want to work on a different version of the code, or when you want to create a new branch to work on.
git branch: This command allows you to create, list, and delete branches in your Git repository. When you create a new branch, you create a separate version of the code that can be modified independently of the main branch. This allows you to experiment with changes without affecting the main codebase. You can also use git branch to see a list of all branches in your repository and to switch between them using git checkout.
git merge: This command allows you to combine changes from one branch into another. When you run git merge followed by the name of the branch you want to merge, Git will apply the changes made in that branch to the current branch. This is useful when you want to incorporate changes from a feature branch into the main codebase, or when you want to bring a forked repository up to date with the original repository.

Fun Fact :

.gitignore is a file that specifies files or directories that Git should ignore when tracking changes in a repository. This is useful for files that are generated during the development process, such as log files or temporary files, or for sensitive information that should not be committed to the repository, such as API keys or passwords. By listing these files or directories in a .gitignore file, you can ensure that they are not accidentally committed to the repository.

.gitkeep, on the other hand, is a file that is used to ensure that an otherwise empty directory is included in a Git repository. Git does not track empty directories, so if you want to include an empty directory in your repository, you can add a .gitkeep file to that directory. The name is a convention rather than a Git feature. The file can be empty, but its presence gives Git something to track, so the directory is included in the repository.
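To get a feel for how ignore patterns behave, here is a deliberately simplified matcher. It is only an approximation: Git's real matching rules have more cases (negation, anchored paths, `**`), and the pattern list below is just an example.

```python
from fnmatch import fnmatch

IGNORE_PATTERNS = ["*.log", ".env", "build/"]  # example patterns, not from a real project


def is_ignored(path, patterns=IGNORE_PATTERNS):
    """Rough approximation of .gitignore matching for simple patterns."""
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything inside a directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif any(fnmatch(part, pattern) for part in parts):
            # Plain pattern: match against each file or directory name in the path.
            return True
    return False


for path in ["logs/app.log", ".env", "build/output.js", "src/main.py", "assets/.gitkeep"]:
    print(path, is_ignored(path))
```

Note that `assets/.gitkeep` is not ignored, which is exactly why the file keeps its directory in the repository.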

Now you know the two main workflows used by some of the biggest organizations in the world. With these workflows, we can ensure proper integration and collaboration of work where negative collisions are avoided in a workspace of multiple developers.

The knowledge of these tools will definitely help you build and dish out better software and open you up to a thriving community of developers working towards common goals. Let's Git it. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

My GitHub Campus Expert 🚩 Application Process [SELECTED]

Saptarshi Bhattacharya — Tue, 14 Feb 2023 07:36:29 GMT

Hello everyone, hope you're doing fine. I've been DM'ed a lot over LinkedIn and a lot of other Social Media Platforms regarding queries about my GitHub Campus Expert Journey and also about the selection process. This blog is an attempt to answer all those questions and help understand what are the necessary steps to follow to become a GitHub Campus Expert 🚩.

How did I come across the GitHub Campus Expert 🚩Program?

I had already got my hands on the GitHub Student Developers Pack which provides you with access to amazing developer tools worth a...lot of money. Probably 5 figures. USD. Probably even more. You can learn and improve your skills using the Developers Pack. (Check it out if you haven't).

Anyways, I was browsing through the GitHub Education website and discovered the GitHub Campus Expert 🚩 Program. After some reading and a quick YouTubing session, I clicked on the Become a Campus Expert button. (I was a bit early and asked them to notify me when the selection process started, but we'll just skip through that. :D)

So what is the GitHub Campus Expert 🚩Program?

According to GitHub- "Campus Experts are student leaders that strive to build diverse and inclusive spaces to learn skills, share their experiences, and build projects together. They can be found across the globe leading in-person and online conferences, meetups, and hackathons, and maintaining open source projects."

So as a Campus Expert 🚩, GitHub provides you with resources that help to grow your local community from scratch. It helps you in organizing events and doing everything else to engage and nurture your community. (You also get a lot of personal opportunities and networking opportunities!) Although GitHub Campus Experts 🚩 are not your average Campus Ambassadors, they are not GitHub Employees either. We represent and spread the boon of Git and GitHub and try to help grow local communities.

Eligibility

To apply for the program, you must:

Be 18 years of age or older
Have a GitHub Account that's at least 6 months old.
Have the GitHub Student Developer Pack
Be enrolled in a formal higher education institution
Have at least one year before graduating

Step 1: Get the Pack

That's right. You need to get your hands on the GitHub Student Developers Pack to be eligible for the application to the program. You need to verify that you are a student, by uploading any Institute ID or your Institute-issued student email ID. Once you've been verified you can proceed to the next step which is filling up the initial application. This part is pretty self-explanatory and should be easy. (hmu if you still get stuck)

Step 2: The Form

Complete the application form with all the essays. Remember plagiarism is a sin and the system will auto-reject your application if it senses any form of plagiarism. The essays and the entire process check for these things:

Potential: What do you want to do? What do you want to learn?
Motivation: Why do you want to do what you want to do?
Interest: Why do you want to be part of the program? What have you done to ensure success thus far?
Contribution: What will you be able to do once you're in? What have you already done?

Try to include these details in your essays:

Trace out your community. What kind of people does your community comprise?
Talk about the problems you have suffered building/working with your community. How have you tackled it (if you have) and how does it hamper people from learning/achieving goals?
Explain what the status quo is in your community.
Where will the GitHub Campus Expert 🚩Program come into the picture?
Talk about your values. Inclusivity, Diversity, making people feel safe and fostering a learning environment.
Talk about your goals.

Check out this blog for additional reference. (It's from GitHub)

Also maybe this one

Step 3: The Video

You will be notified via email at the end of the review if GitHub would like to move forward with your application. You'll then be asked to submit a video resume - a simple video of you talking about yourself, your community, and your visions for the community.

This helps the GitHub team get a better understanding and get to know you closely. So be confident and free in the way you speak. You'll have to submit the video resume in 2 weeks' time. It takes about a week to review the videos. You'll be notified via email if you make it through.

Reference video I used: ( Thank you Vaishnavi Dwivedi ! )

https://www.youtube.com/watch?v=wu-lQfoS6A0

Step 4: Training

If your submission was approved, Congratulations! You've been accepted to the program. You'll go through the GitHub Campus Experts 🚩 Training. The training has six modules and takes 12 hours to complete in a span of 6 weeks.

Here you'll be able to analyze your community and learn community leadership skills like Inclusivity, Information Design, Public Speaking, Communities, and Software Dev skills. At the end of your training, you'll submit a community proposal that will serve as a guideline for your community and you'll become a GitHub Campus Expert 🚩.

F.A.Q.

Q: Can you please share your training?

A: No, the training modules are designed to help you learn and look out for things that are needed to become an effective Community Lead/Expert. You should come up with answers yourself.

Q: Can you review my essays?

A: I'm sorry but I won't be able to help every one of you with your essays. But I can give you a few pointers. Try to be yourself. Reflect upon what you know about your community, and what you can and will do. Your intentions should speak. Do not try to plagiarize from others.

Q: My application got rejected. Can I reapply?

A: Yes absolutely, since the program looks for newer Campus Experts 🚩 every 6 months, you can go for it in the next semester.

I'll keep on adding the FAQs as I get more questions!

Thank You!

Thank you for being with me till here! I tried to cover everything that would be necessary to know about the process. Good Luck with your application. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Why is Random Forest better than Decision Trees?

Saptarshi Bhattacharya — Sun, 26 Dec 2021 13:00:16 GMT

Prerequisites: Familiarity with Decision Trees.

Ok, so let us consider the Dataset given below.

A Decision Tree fitting this dataset would look something like this.

All good, right? But what if we encounter a change in the original data set? Would the Decision Tree still work as expected? Would it still generalize well to the changed Dataset and continue producing results in accordance with its earlier accuracy?

The answer is No.

Let's see by changing the original dataset.

If we just change these two entries to no instead of yes, the Decision Tree would have only a 50% chance of predicting the correct result, and this is just on the Training Set. We can safely say that any change in the original data strongly affects the performance of a Decision Tree fitted to the unchanged data. So what do we do?

We use an ensemble technique called Random Forest.

Question Alert!

What is an ensemble technique in Machine Learning?

Ensemble methods are machine learning techniques that combine several base models in order to produce one optimal predictive model. Meaning Apes Together Strong.

So, what actually happens in a Random Forest? Let's see.

Processes in Random Forests

Bootstrapping
Random Feature Selection
Simultaneous Model Training
Aggregation

Bootstrapping

We take rows from the dataset with replacement. We take as many rows as were present in the original dataset. This means that there may be repetitions in the new bootstrapped dataset. We make a number of these new datasets (a hyperparameter).

Notice that in the picture above, there may be repetitions in each bootstrapped dataset.
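Bootstrapping is easy to sketch with the standard library. The row labels and the number of datasets below are placeholders for illustration, not the dataset from the pictures.

```python
import random


def bootstrap(rows, rng):
    """Draw len(rows) rows with replacement, so repeats are possible."""
    return rng.choices(rows, k=len(rows))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = ["row-1", "row-2", "row-3", "row-4", "row-5"]  # stand-ins for dataset rows
n_datasets = 3  # hyperparameter: how many bootstrapped datasets to build

bootstrapped = [bootstrap(rows, rng) for _ in range(n_datasets)]
for dataset in bootstrapped:
    print(dataset)
```

Each new dataset has as many rows as the original, and some rows may show up more than once while others are left out.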

Random Feature Selection

We randomly select a few (another hyperparameter) columns (features) for each dataset.

For example, the first two features for the first dataset, the last two for the second, and so on.
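A minimal sketch of this step, assuming each row is a dictionary and the label column is called `play`. The feature names and rows are made up for illustration.

```python
import random


def select_features(feature_names, n_features, rng):
    """Pick n_features distinct columns at random."""
    return rng.sample(feature_names, n_features)


def project(rows, features, target):
    """Keep only the chosen feature columns plus the target column."""
    keep = list(features) + [target]
    return [{name: row[name] for name in keep} for row in rows]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
feature_names = ["outlook", "humidity", "wind", "temperature"]  # made-up features
rows = [
    {"outlook": "sunny", "humidity": "high", "wind": "weak", "temperature": "hot", "play": "no"},
    {"outlook": "rain", "humidity": "normal", "wind": "strong", "temperature": "cool", "play": "yes"},
]

chosen = select_features(feature_names, 2, rng)  # 2 is the hyperparameter here
subset = project(rows, chosen, target="play")
print(chosen)
print(subset)
```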

At this point, we have various individual Datasets. What to do next? Train a Decision Tree model on each of those.

Simultaneous Model Training

Build Trees for all datasets. This brings along all the hyperparameters associated with each Decision Tree.

We now have completed all the parts in Random Forest. The next question is...

How does it all come along?

Simple! The steps are

Bootstrap rows
Select Random Features
Train Decision Trees on each
Make predictions using all models. For Classification take the majority vote and for Regression take an average or weighted average or mode (whatever gives the best result).
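The aggregation step can be sketched in a few lines. The prediction lists below are invented, standing in for the outputs of already trained trees.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Classification: return the label predicted by the most trees."""
    # On a tie, Counter keeps the label that was seen first.
    return Counter(predictions).most_common(1)[0][0]


def average(predictions):
    """Regression: return the mean of the trees' predictions."""
    return mean(predictions)


tree_labels = ["yes", "no", "yes"]  # invented outputs of three trees
tree_values = [2.0, 3.0, 4.0]       # invented regression outputs

print(majority_vote(tree_labels))
print(average(tree_values))
```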

FAQ

Q1. Why Random?

Ans. Due to Bootstrapping and Random Feature Selection which randomizes the datasets.

Q2. Why Forest?

Ans. Because there is more than one Decision Tree. Trees. Get it? :D

Q3. Why Bootstrap and Random Feature Selection?

Ans. Bootstrap ensures that we don't use the same data every time, so our models are less sensitive to the original dataset. Random Feature Selection, on the other hand, reduces the correlation between trees. If Random Feature Selection were not used, all trees would produce very similar results, and combining them would do little to reduce the overall variance. With it, some trees give bad results in one direction and others in the opposite direction, so the errors balance out.

Q4. What's the ideal size of the feature subset?

Ans. Two values usually give the best result:

Log of the total number of features.
Square root of the total number of features
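Here is a quick sketch of both rules of thumb. The text does not say which logarithm base is meant, so base 2 is an assumption, as is rounding to the nearest whole number.

```python
import math


def subset_sizes(n_features):
    """Two common starting points for the feature subset size."""
    return {
        # Assumption: log base 2, rounded, and never below one feature.
        "log2": max(1, round(math.log2(n_features))),
        "sqrt": max(1, round(math.sqrt(n_features))),
    }


for n in (16, 100):
    print(n, subset_sizes(n))
```

Treat both numbers as starting points and tune the value like any other hyperparameter.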

Note

High variance of Decision tree averages out to be low variance because
- Each tree recognizes a few features.
- We take majority or mean.
Change of Training Data will impact Random Forest much less than Decision Trees.

Materials

Links

Cheers!
