<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[High on Bugs!]]></title><description><![CDATA[Articles on ML and System Design]]></description><link>https://highonbugs.sbk2k1.in</link><generator>RSS for Node</generator><lastBuildDate>Wed, 13 May 2026 18:14:07 GMT</lastBuildDate><atom:link href="https://highonbugs.sbk2k1.in/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Why Attention Is All You Need — A Dimensional and Mathematical Intuition Guide]]></title><description><![CDATA[1. Introduction
In 2017, Vaswani et al. dropped a paper titled “Attention Is All You Need,” and it quietly rewired the entire field of deep learning. Within a few years, its architecture — the Transformer — became the foundation for nearly every mode...]]></description><link>https://highonbugs.sbk2k1.in/why-attention-is-all-you-need-a-dimensional-and-mathematical-intuition-guide</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/why-attention-is-all-you-need-a-dimensional-and-mathematical-intuition-guide</guid><category><![CDATA[attention-mechanism]]></category><category><![CDATA[transformers]]></category><category><![CDATA[Attention Is All You Need]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Mon, 06 Oct 2025 08:08:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1759738057061/ab5330b8-8e50-4637-96fe-99594b67ad45.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-1-introduction">1. Introduction</h2>
<p>In 2017, Vaswani et al. dropped a paper titled <em>“Attention Is All You Need,”</em> and it quietly rewired the entire field of deep learning. Within a few years, its architecture — the <strong>Transformer</strong> — became the foundation for nearly every modern AI system: GPTs, BERT, diffusion models, even vision networks.</p>
<p>Before this paper, sequence modeling relied on <strong>recurrent networks (RNNs and LSTMs)</strong> that processed data <em>step-by-step</em>, passing information forward through time. That meant slow training, limited parallelism, and the infamous problem of forgetting information from distant tokens.</p>
<p>The Transformer proposed a radical shift:</p>
<blockquote>
<p><em>Forget time; learn relationships.</em></p>
</blockquote>
<p>Instead of iterating over tokens sequentially, each token could directly <strong>“attend” to every other token</strong> in the sequence, capturing context in a <em>single forward pass</em>. This attention-based mechanism not only removed recurrence but also made training fully parallelizable — perfect for GPUs.</p>
<p>In this post, we’ll rebuild the intuition and math behind the paper:</p>
<ul>
<li><p>How RNNs gave way to attention mechanisms</p>
</li>
<li><p>What “self-attention” really computes</p>
</li>
<li><p>How dimensionality flows through the Q, K, V projections</p>
</li>
<li><p>Why multiple heads and feedforward layers matter</p>
</li>
<li><p>How the encoder–decoder structure ties it all together</p>
</li>
</ul>
<p>By the end, you should be able to <strong>visualize every transformation in terms of both meaning and shape</strong>, and truly see why <em>attention was, and still is, all we needed.</em></p>
<h2 id="heading-2-rnns-what-they-were-and-why-they-broke">2. RNNs — What They Were and Why They Broke</h2>
<p>Before the Transformer, nearly every sequential model used <strong>Recurrent Neural Networks (RNNs)</strong>.<br />RNNs process sequences token by token while maintaining a hidden "memory" of what came before.</p>
<hr />
<h3 id="heading-21-what-are-rnns">2.1 What are RNNs?</h3>
<p>At each time step <em>t</em>, an RNN updates a hidden state <strong>hₜ</strong> using the current input <strong>xₜ</strong> and the previous hidden state <strong>hₜ₋₁</strong>:</p>
<p><strong>hₜ = f(Wₓ · xₜ + Wₕ · hₜ₋₁)</strong><br /><strong>yₜ = Wᵧ · hₜ</strong></p>
<p>Here:</p>
<ul>
<li><p><em>xₜ</em> → input vector at step <em>t</em></p>
</li>
<li><p><em>hₜ</em> → hidden state (the model’s internal memory)</p>
</li>
<li><p><em>f</em> → activation function (usually tanh or ReLU)</p>
</li>
</ul>
<p>This creates a chain of dependencies — every output depends on all previous steps.</p>
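<p>To make the recurrence concrete, here is a minimal NumPy sketch of the two equations above. The sizes, the random weights, and the choice of tanh for <em>f</em> are illustrative assumptions, not values from the paper. Notice the Python loop: step <em>t</em> needs the hidden state from step <em>t−1</em>, so the work cannot be spread across time steps.</p>
<pre><code class="lang-python">import numpy as np

def rnn_forward(xs, W_x, W_h, W_y):
    """Run a vanilla RNN over a sequence, one time step at a time."""
    h = np.zeros(W_h.shape[0])            # h_0: initial hidden state
    outputs = []
    for x_t in xs:                        # step t cannot start before step t-1 finishes
        h = np.tanh(W_x @ x_t + W_h @ h)  # h_t = f(W_x x_t + W_h h_{t-1})
        outputs.append(W_y @ h)           # y_t = W_y h_t
    return np.stack(outputs), h

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, n_steps = 4, 8, 3, 5   # illustrative sizes
xs = rng.normal(size=(n_steps, d_in))
W_x = rng.normal(size=(d_hidden, d_in)) * 0.1
W_h = rng.normal(size=(d_hidden, d_hidden)) * 0.1
W_y = rng.normal(size=(d_out, d_hidden)) * 0.1

ys, h_last = rnn_forward(xs, W_x, W_h, W_y)
print(ys.shape, h_last.shape)   # (5, 3) (8,)
</code></pre>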
<hr />
<h3 id="heading-22-the-core-problems">2.2 The Core Problems</h3>
<p><strong>1. Sequential Dependency</strong><br />Each step depends on the previous one. You can’t compute step <em>t+1</em> until <em>t</em> is finished.</p>
<ul>
<li>This makes training and inference very slow and non-parallelizable.</li>
</ul>
<p><strong>2. Vanishing and Exploding Gradients</strong><br />During backpropagation, gradients pass through many time steps.</p>
<ul>
<li><p>If the recurrent weights repeatedly shrink the signal, gradients vanish and the influence of early tokens is lost.</p>
</li>
<li><p>If they repeatedly amplify it, gradients explode and training becomes unstable.</p>
</li>
</ul>
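<p>A scalar caricature shows why this happens. Assume, purely for illustration, that each step of backpropagation through time multiplies the gradient by the same factor. After 100 steps the gradient has either collapsed or blown up:</p>
<pre><code class="lang-python"># Toy model of backpropagation through time: the gradient is multiplied
# by (roughly) the same recurrent factor at every time step.
def gradient_scale(factor, n_steps):
    return factor ** n_steps

for factor in (0.9, 1.0, 1.1):
    print(factor, gradient_scale(factor, 100))
# 0.9 gives roughly 2.7e-05 (vanishing), 1.1 gives roughly 1.4e+04 (exploding)
</code></pre>
<p>In a real RNN the factor is a matrix (the Jacobian of one step), but the same compounding effect applies.</p>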
<p><strong>3. Information Decay</strong><br />The hidden state is a single fixed-size vector that must store <em>all</em> past context.<br />Older information fades as new information arrives — much like trying to remember the start of a long sentence.</p>
<p><strong>4. Long Inference Time</strong><br />Inference must also be sequential. You can’t predict multiple tokens at once because each depends on the last output.</p>
<h2 id="heading-3-transformer-intuition-from-memory-chains-to-attention-maps">3. Transformer Intuition — From Memory Chains to Attention Maps</h2>
<p>Recurrent models view sequences as chains: information flows step by step. The Transformer introduced a new way of thinking — instead of passing information through time, it lets every token directly connect to every other token.</p>
<p>This is the essence of <strong>attention</strong>.</p>
<hr />
<h3 id="heading-31-the-core-idea">3.1 The Core Idea</h3>
<p>In an RNN, the token at position <em>t</em> can only use information passed from earlier positions.<br />In a Transformer, the token at position <em>t</em> can "look" at every other token in the sequence, including itself, and decide <strong>which ones are relevant</strong>.</p>
<p>This means:</p>
<ul>
<li><p>No recurrence or time dependency.</p>
</li>
<li><p>All tokens are processed <strong>in parallel</strong>.</p>
</li>
<li><p>Context is learned by comparing tokens directly.</p>
</li>
</ul>
<hr />
<h3 id="heading-32-the-intuitive-analogy">3.2 The Intuitive Analogy</h3>
<p>Think of reading a sentence like “The animal didn’t cross the street because it was too tired.”</p>
<p>When you read the word “it”, you don’t have to replay the entire sentence sequentially. You instantly recall the relevant part — “the animal”.<br />That’s exactly what attention does: each token <strong>attends</strong> to the parts of the sequence that matter most for understanding its own meaning.</p>
<hr />
<h3 id="heading-33-computation-as-relationships">3.3 Computation as Relationships</h3>
<p>The Transformer encodes these relationships through a set of <strong>learnable projections</strong>:</p>
<ul>
<li><p>Each token’s embedding is projected into three spaces: <strong>Query (Q)</strong>, <strong>Key (K)</strong>, and <strong>Value (V)</strong>.</p>
</li>
<li><p>The query of one token measures how much it relates to the keys of all other tokens.</p>
</li>
<li><p>The result is a weighted combination of their values, forming a new representation for that token.</p>
</li>
</ul>
<p>Mathematically, for each token:</p>
<ul>
<li><p>Attention weights = softmax(Q · Kᵀ / √d_k), where d_k is the key dimension</p>
</li>
<li><p>Output = Attention weights × V</p>
</li>
</ul>
<p>This mechanism directly models pairwise relationships between tokens, rather than relying on sequential memory.</p>
<hr />
<h3 id="heading-34-why-this-matters">3.4 Why This Matters</h3>
<p>The Transformer’s self-attention lets the model:</p>
<ul>
<li><p>Capture <strong>global dependencies</strong> between tokens (not limited by distance).</p>
</li>
<li><p>Train <strong>in parallel</strong>, since all tokens attend simultaneously.</p>
</li>
<li><p>Retain <strong>long-term context</strong> efficiently.</p>
</li>
</ul>
<p>In short, attention turns sequential data into a <strong>fully connected relationship graph</strong> between tokens, computed in a single step.</p>
<hr />
<h3 id="heading-35-the-shift-in-perspective">3.5 The Shift in Perspective</h3>
<p>Before the Transformer, “sequence” implied “time”.<br />After it, “sequence” became a <strong>set of relationships</strong>.</p>
<p>The model doesn’t think in terms of steps; it thinks in terms of <strong>contextual relevance</strong>.<br />This shift is what enabled modern large language models — systems that learn meaning by understanding the <em>relationships between words</em>, not their positions in a timeline.</p>
<h2 id="heading-4-input-representation">4. Input Representation</h2>
<p>Before attention can operate, the raw tokens of a sequence must be converted into vectors that the model can process. This is done in two steps: <strong>token embeddings</strong> and <strong>positional encodings</strong>.</p>
<hr />
<h3 id="heading-41-token-embeddings">4.1 Token Embeddings</h3>
<ul>
<li><p>Each word or token is mapped to a <strong>dense vector</strong> of dimension <code>d_model</code>.</p>
</li>
<li><p>If the input sequence has <code>n</code> tokens, the embedding matrix <code>X</code> has shape:</p>
</li>
</ul>
<pre><code class="lang-elixir">X ∈ ℝ^(n × d_model)
</code></pre>
<ul>
<li><p>These embeddings capture semantic meaning — similar words have similar vector representations.</p>
</li>
<li><p>At this stage, there is <strong>no positional information</strong>; the model doesn’t know which token comes first or last.</p>
</li>
</ul>
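<p>In code, the embedding step is just a row lookup in a table. The sketch below uses a hypothetical four-word vocabulary and a random table standing in for learned embeddings; <code>d_model = 8</code> is chosen for readability (the paper uses 512).</p>
<pre><code class="lang-python">import numpy as np

vocab = {"the": 0, "animal": 1, "was": 2, "tired": 3}   # hypothetical toy vocabulary
d_model = 8                                             # illustrative size

rng = np.random.default_rng(0)
# One row per vocabulary entry; learned during training in a real model.
embedding_table = rng.normal(size=(len(vocab), d_model))

tokens = ["the", "animal", "was", "tired"]
token_ids = np.array([vocab[t] for t in tokens])
X = embedding_table[token_ids]      # row lookup: one d_model vector per token

print(X.shape)   # (4, 8), i.e. n x d_model
</code></pre>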
<hr />
<h3 id="heading-42-positional-encodings">4.2 Positional Encodings</h3>
<p>Since the Transformer <strong>does not process tokens sequentially</strong>, we need to inject information about <strong>token positions</strong> in the sequence.</p>
<p>The paper uses <strong>sinusoidal positional encodings</strong>:</p>
<ul>
<li>For each position <code>pos</code> and dimension-pair index <code>i</code> (from 0 to <code>d_model/2 − 1</code>):</li>
</ul>
<pre><code class="lang-elixir">PE(pos, <span class="hljs-number">2</span>i)   = sin(pos / <span class="hljs-number">10000</span>^(<span class="hljs-number">2</span>i / d_model))
PE(pos, <span class="hljs-number">2</span>i+<span class="hljs-number">1</span>) = cos(pos / <span class="hljs-number">10000</span>^(<span class="hljs-number">2</span>i / d_model))
</code></pre>
<ul>
<li><p>This produces a vector <code>PE</code> of the same dimension as the token embeddings (<code>d_model</code>).</p>
</li>
<li><p>These encodings allow the model to <strong>distinguish order</strong> and learn relative positions without recurrence.</p>
</li>
</ul>
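<p>The two formulas translate almost line for line into NumPy. This is a sketch that assumes an even <code>d_model</code>; the function name is my own.</p>
<pre><code class="lang-python">import numpy as np

def sinusoidal_positional_encoding(n, d_model):
    """Return the (n, d_model) matrix PE from the formulas above (d_model even)."""
    pos = np.arange(n)[:, None]                  # (n, 1)
    two_i = np.arange(0, d_model, 2)[None, :]    # (1, d_model/2): the values 2i
    angles = pos / np.power(10000.0, two_i / d_model)
    pe = np.zeros((n, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions 2i
    pe[:, 1::2] = np.cos(angles)   # odd dimensions 2i+1
    return pe

PE = sinusoidal_positional_encoding(n=6, d_model=8)
print(PE.shape)   # (6, 8)
print(PE[0])      # position 0: sin(0) = 0 on even dims, cos(0) = 1 on odd dims
</code></pre>
<p>Low dimensions oscillate quickly with position and high dimensions oscillate slowly, so every position gets a distinct pattern.</p>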
<hr />
<h3 id="heading-43-combining-embeddings-and-positional-encodings">4.3 Combining Embeddings and Positional Encodings</h3>
<p>The final input to the Transformer is the <strong>sum</strong> of token embeddings and positional encodings:</p>
<pre><code class="lang-elixir">E = X + PE
</code></pre>
<ul>
<li><p>Shape of <code>E</code>: <code>n × d_model</code></p>
</li>
<li><p>This combined representation contains both <strong>semantic meaning</strong> and <strong>positional information</strong>.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737286219/c14a4dd7-cac4-4c76-bb91-d592cc82c617.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-44-intuition">4.4 Intuition</h3>
<ul>
<li><p>Each token now has a vector that tells the model:</p>
<ul>
<li><p><em>What the token is</em> (embedding)</p>
</li>
<li><p><em>Where it is in the sequence</em> (positional encoding)</p>
</li>
</ul>
</li>
<li><p>The Transformer can now apply <strong>attention</strong>, knowing both content and position.</p>
</li>
<li><p>The paper chose sinusoids over learned positional embeddings, which performed almost identically, because sinusoids may allow the model to <strong>extrapolate to longer sequences</strong> than those seen during training.</p>
</li>
</ul>
<h2 id="heading-5-self-attention-mechanism-single-head">5. Self-Attention Mechanism (Single Head)</h2>
<p>The key innovation of the Transformer is <strong>self-attention</strong>, a mechanism that allows each token in a sequence to consider all other tokens when forming its representation. Unlike RNNs, which rely on sequential steps to propagate information, self-attention provides each token with <strong>direct access to the entire sequence</strong> in a single step.</p>
<hr />
<h3 id="heading-51-from-embeddings-to-queries-keys-and-values">5.1 From Embeddings to Queries, Keys, and Values</h3>
<p>Starting from the input embeddings <code>E</code> (shape <code>n × d_model</code>), the model generates three separate projections for each token: <strong>Query (Q)</strong>, <strong>Key (K)</strong>, and <strong>Value (V)</strong>.</p>
<ul>
<li><p><strong>Query (Q)</strong> represents what the token is “looking for”</p>
</li>
<li><p><strong>Key (K)</strong> represents the content of the token to be compared against queries</p>
</li>
<li><p><strong>Value (V)</strong> carries the actual information of the token</p>
</li>
</ul>
<p>These projections are obtained by multiplying <code>E</code> with learnable weight matrices:</p>
<pre><code class="lang-elixir">Q = E · W_Q
K = E · W_K
V = E · W_V
</code></pre>
<ul>
<li><p>W_Q, W_K ∈ ℝ^(d_model × d_k) and W_V ∈ ℝ^(d_model × d_v); the paper sets d_v = d_k</p>
</li>
<li><p>Resulting shapes: Q, K ∈ ℝ^(n × d_k) and V ∈ ℝ^(n × d_v)</p>
</li>
</ul>
<p>Here, <code>d_k</code> is typically smaller than <code>d_model</code> for efficiency. With Q, K, and V in hand, the tokens are ready to interact.</p>
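<p>A quick shape check of the three projections, as a sketch. The sizes are illustrative, the weights are random stand-ins for learned matrices, and d_v is taken equal to d_k.</p>
<pre><code class="lang-python">import numpy as np

n, d_model, d_k = 6, 16, 4          # illustrative sizes
rng = np.random.default_rng(0)

E = rng.normal(size=(n, d_model))       # stand-in for embeddings + positional encodings
W_Q = rng.normal(size=(d_model, d_k))   # learned in a real model; random here
W_K = rng.normal(size=(d_model, d_k))
W_V = rng.normal(size=(d_model, d_k))

Q, K, V = E @ W_Q, E @ W_K, E @ W_V
print(Q.shape, K.shape, V.shape)   # (6, 4) (6, 4) (6, 4)
</code></pre>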
<hr />
<h3 id="heading-52-computing-attention">5.2 Computing Attention</h3>
<p>Self-attention measures <strong>how much each token should attend to every other token</strong>. This is done in four steps:</p>
<ol>
<li>Compute similarity scores between queries and keys:</li>
</ol>
<pre><code class="lang-elixir">Scores = Q · Kᵀ      <span class="hljs-comment"># shape: n × n</span>
</code></pre>
<ol start="2">
<li>Scale the scores by √d_k to prevent excessively large values that destabilize gradients:</li>
</ol>
<pre><code class="lang-elixir">Scores_scaled = Scores / √d_k
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737382028/fa2a01e0-735a-44d8-8500-ee4e2f100b6b.png" alt class="image--center mx-auto" /></p>
<p><strong>Note:</strong> the 6×6 matrix in the figure holds the relationship score of each token with every other token in the sequence. These scores can also be manipulated manually, which is how future tokens are masked in the decoder block.</p>
<ol start="3">
<li>Apply softmax to convert scores into attention weights:</li>
</ol>
<pre><code class="lang-elixir">Weights = softmax(Scores_scaled)
</code></pre>
<ol start="4">
<li>Multiply the weights by the values to get the output:</li>
</ol>
<pre><code class="lang-elixir">Output = Weights · V    <span class="hljs-comment"># shape: n × d_v</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737416795/747e8ab4-119d-4441-98e9-548fb45f7859.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Each row in the output corresponds to a <strong>contextualized vector</strong> for that token.</p>
</li>
<li><p>In essence, each token gathers information from the entire sequence, weighted by relevance.</p>
</li>
</ul>
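<p>The four steps fit in a few lines of NumPy. This is a minimal single-head sketch with random Q, K, V; the helper names are my own and the sizes are illustrative.</p>
<pre><code class="lang-python">import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # steps 1 and 2: scaled scores, shape (n, n)
    weights = softmax(scores, axis=-1)   # step 3: each row sums to 1
    return weights @ V, weights          # step 4: output of shape (n, d_v)

rng = np.random.default_rng(0)
n, d_k = 6, 4
Q, K, V = (rng.normal(size=(n, d_k)) for _ in range(3))
output, weights = attention(Q, K, V)
print(output.shape, weights.shape)   # (6, 4) (6, 6)
print(weights.sum(axis=-1))          # every row sums to 1 (up to rounding)
</code></pre>
<p>Row <em>t</em> of <code>weights</code> is the attention distribution of token <em>t</em> over the whole sequence, and row <em>t</em> of <code>output</code> is the corresponding weighted average of the value vectors.</p>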
<hr />
<h3 id="heading-53-intuition">5.3 Intuition</h3>
<p>Imagine the sentence: “The animal didn’t cross the street because it was tired.”</p>
<p>When processing the token “it,” self-attention allows it to look at every other word.<br />It assigns higher weights to “animal” (its antecedent) and lower weights to unrelated tokens like “street” or “cross.”</p>
<p>Unlike RNNs, this mechanism <strong>does not rely on sequential propagation</strong>, allowing the model to capture long-range dependencies efficiently.</p>
<hr />
<h3 id="heading-54-dimensional-flow">5.4 Dimensional Flow</h3>
<ul>
<li><p><strong>Input embeddings:</strong> <code>E</code> → n × d_model</p>
</li>
<li><p><strong>Projections:</strong> Q, K, V → n × d_k</p>
</li>
<li><p><strong>Attention scores:</strong> Q · Kᵀ → n × n</p>
</li>
<li><p><strong>Weighted sum:</strong> Weights · V → n × d_v</p>
</li>
</ul>
<p>Even a single attention head enables <strong>global context modeling</strong> in one step.<br />Every token’s new representation is a <strong>context-aware summary</strong> of the sequence.</p>
<h2 id="heading-6-multi-head-attention">6. Multi-Head Attention</h2>
<p>While a single attention head allows each token to attend to the entire sequence, it has a limitation: it can only focus on one type of relationship at a time. <strong>Multi-head attention</strong> solves this by allowing the model to learn multiple types of relationships in parallel.</p>
<hr />
<h3 id="heading-61-why-multiple-heads">6.1 Why Multiple Heads?</h3>
<p>Each attention head operates in its own subspace of the token embeddings. This allows the model to:</p>
<ul>
<li><p>Capture different types of dependencies simultaneously (e.g., syntactic, semantic, positional)</p>
</li>
<li><p>Focus on multiple aspects of the sequence at the same time</p>
</li>
<li><p>Improve representation diversity and richness</p>
</li>
</ul>
<p>For example, in the sentence “The animal didn’t cross the street because it was tired,” one head might focus on <strong>subject-verb relationships</strong>, while another focuses on <strong>pronoun references</strong>.</p>
<hr />
<h3 id="heading-62-how-it-works">6.2 How It Works</h3>
<ol>
<li><p>Start with the input embeddings <code>E</code> (shape <code>n × d_model</code>).</p>
</li>
<li><p>For each of the <code>h</code> heads, project <code>E</code> into its own <strong>Q, K, V</strong> matrices:</p>
</li>
</ol>
<pre><code class="lang-elixir">Q_i = E · W_Qi
K_i = E · W_Ki
V_i = E · W_Vi
</code></pre>
<ul>
<li><p>W_Qi, W_Ki, W_Vi ∈ ℝ^(d_model × d_k), where d_k = d_model / h</p>
</li>
<li><p>Each head computes attention independently:</p>
</li>
</ul>
<pre><code class="lang-elixir">head_i = Attention(Q_i, K_i, V_i)
</code></pre>
<ol start="3">
<li>Concatenate the outputs of all heads:</li>
</ol>
<pre><code class="lang-elixir">Concat(head_1, ..., head_h)   <span class="hljs-comment"># shape: n × d_model</span>
</code></pre>
<ol start="4">
<li>Project the concatenated output back to <code>d_model</code> with a matrix W_O:</li>
</ol>
<pre><code class="lang-elixir">MultiHeadOutput = Concat(heads) · W_O
</code></pre>
<ul>
<li><p>W_O ∈ ℝ^(d_model × d_model)</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737465037/484314c8-4732-41e5-be85-443ba92ea7c0.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
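<p>Here is one way to sketch multi-head attention in NumPy. Instead of storing <code>h</code> separate matrices per projection, it uses one <code>d_model × d_model</code> matrix and splits the result into heads, which is equivalent to using its column slices as the per-head W_Qi, W_Ki, W_Vi. That packing is an implementation choice on my part, not something the text above prescribes, and the weights are random stand-ins.</p>
<pre><code class="lang-python">import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(E, W_Q, W_K, W_V, W_O, h):
    n, d_model = E.shape
    d_k = d_model // h

    def split(M):
        # (n, d_model) to (h, n, d_k): one slice of the features per head
        return M.reshape(n, h, d_k).transpose(1, 0, 2)

    Q, K, V = split(E @ W_Q), split(E @ W_K), split(E @ W_V)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)        # (h, n, n)
    heads = softmax(scores) @ V                             # (h, n, d_k)
    concat = heads.transpose(1, 0, 2).reshape(n, d_model)   # (n, d_model)
    return concat @ W_O                                     # (n, d_model)

rng = np.random.default_rng(0)
n, d_model, h = 6, 16, 4   # illustrative sizes, so d_k = 4
E = rng.normal(size=(n, d_model))
W_Q, W_K, W_V, W_O = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out = multi_head_attention(E, W_Q, W_K, W_V, W_O, h)
print(out.shape)   # (6, 16)
</code></pre>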
<hr />
<h3 id="heading-63-dimensional-flow">6.3 Dimensional Flow</h3>
<ul>
<li><p><strong>Input embeddings:</strong> n × d_model</p>
</li>
<li><p><strong>Each head Q, K, V:</strong> n × d_k (d_k = d_model / h)</p>
</li>
<li><p><strong>Attention per head:</strong> n × d_k</p>
</li>
<li><p><strong>Concatenated heads:</strong> n × d_model</p>
</li>
<li><p><strong>Final projection:</strong> n × d_model</p>
</li>
</ul>
<p>This ensures that, no matter how many heads are used, the output has the same shape as the input, allowing <strong>residual connections</strong> and smooth stacking of layers.</p>
<hr />
<h3 id="heading-64-intuition">6.4 Intuition</h3>
<ul>
<li><p>Think of each head as a <strong>specialized lens</strong> focusing on a particular type of relationship in the sequence.</p>
</li>
<li><p>By combining multiple lenses, the model develops a <strong>multi-faceted understanding</strong> of the input.</p>
</li>
<li><p>Multi-head attention is therefore a powerful way to <strong>increase model expressiveness without increasing sequence length or token dimensions</strong>.</p>
</li>
</ul>
<h2 id="heading-7-layer-normalization">7. Layer Normalization</h2>
<p>After multi-head attention, each token has a new contextual representation. Before passing it through the next sublayer (like the feedforward network), it is important to <strong>stabilize and normalize</strong> these representations. This is where <strong>Layer Normalization (LayerNorm)</strong> comes in.</p>
<hr />
<h3 id="heading-71-why-not-batch-normalization">7.1 Why Not Batch Normalization?</h3>
<p>Batch Normalization works by normalizing across the <strong>batch dimension</strong>. While this is effective for images and other fixed-size inputs, it has two main issues for sequences:</p>
<ul>
<li><p>Sequences can have <strong>different lengths</strong>. Padding tokens introduce noise if normalized across the batch.</p>
</li>
<li><p>Each token should maintain <strong>independence</strong>; batch statistics mix token information across samples, which is undesirable for attention-based models.</p>
</li>
</ul>
<p>LayerNorm solves both problems by normalizing <strong>across features for each token individually</strong>, not across the batch.</p>
<hr />
<h3 id="heading-72-how-layernorm-works">7.2 How LayerNorm Works</h3>
<p>For a token representation <code>x ∈ ℝ^d_model</code>:</p>
<ol>
<li>Compute the mean and variance across features:</li>
</ol>
<pre><code class="lang-elixir">μ = (<span class="hljs-number">1</span>/d_model) * Σ x_i
σ² = (<span class="hljs-number">1</span>/d_model) * Σ (x_i - μ)²
</code></pre>
<ol start="2">
<li>Normalize and scale:</li>
</ol>
<pre><code class="lang-elixir">LN(x) = γ * (x - μ) / sqrt(σ² + ε) + β
</code></pre>
<ul>
<li><p>γ and β are learnable parameters (scale and shift)</p>
</li>
<li><p>ε is a small constant for numerical stability</p>
</li>
</ul>
<p>The output has the <strong>same shape as the input</strong> (<code>d_model</code>), but features are normalized, which stabilizes training and improves convergence.</p>
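<p>A minimal NumPy sketch of the formula, applied to a whole sequence at once. The value of <code>eps</code> and the initial γ = 1, β = 0 are common defaults that I am assuming here.</p>
<pre><code class="lang-python">import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each token (row) across its d_model features."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)    # population variance, as in the formula
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
n, d_model = 6, 16
x = rng.normal(loc=3.0, scale=5.0, size=(n, d_model))   # deliberately off-center input
gamma, beta = np.ones(d_model), np.zeros(d_model)

y = layer_norm(x, gamma, beta)
print(y.shape)                    # (6, 16)
print(y.mean(axis=-1).round(6))   # approximately 0 for every token
print(y.std(axis=-1).round(3))    # approximately 1 for every token
</code></pre>
<p>The statistics are taken along <code>axis=-1</code>, the feature axis, so no token's normalization depends on any other token or on the rest of the batch.</p>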
<hr />
<h3 id="heading-73-intuition">7.3 Intuition</h3>
<ul>
<li><p>LayerNorm ensures that <strong>each token’s vector has a consistent scale</strong>, preventing some features from dominating attention or the feedforward network.</p>
</li>
<li><p>Normalization is done <strong>per token</strong>, so padding or variable-length sequences do not affect other tokens.</p>
</li>
<li><p>Combined with <strong>residual connections</strong>, LayerNorm allows deeper networks to train effectively without vanishing or exploding gradients.</p>
</li>
</ul>
<p>Check out this video for a better understanding of why LayerNorm is preferred over BatchNorm for sequence data. (The video is in Hindi, but it should be easy to follow.)</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=qti0QPdaelg">https://www.youtube.com/watch?v=qti0QPdaelg</a></div>
<hr />
<h3 id="heading-74-position-in-the-transformer">7.4 Position in the Transformer</h3>
<ul>
<li>LayerNorm is applied <strong>after the residual connection</strong> in each sublayer:</li>
</ul>
<pre><code class="lang-elixir">Output = LayerNorm(x + Sublayer(x))
</code></pre>
<ul>
<li>This structure is repeated for both <strong>multi-head attention</strong> and <strong>feedforward sublayers</strong>, keeping token-wise representations stable throughout the stack.</li>
</ul>
<h2 id="heading-8-feedforward-fully-connected-network">8. Feedforward Fully Connected Network</h2>
<p>After each token passes through multi-head attention, the Transformer applies a <strong>position-wise feedforward network (FFN)</strong>. Unlike attention, which mixes information across tokens, the FFN operates <strong>independently on each token</strong>, enriching its representation with nonlinear transformations.</p>
<hr />
<h3 id="heading-81-structure-of-the-feedforward-network">8.1 Structure of the Feedforward Network</h3>
<p>For a token vector <code>x ∈ ℝ^d_model</code>, the FFN consists of <strong>two linear layers with a ReLU activation</strong> in between:</p>
<pre><code class="lang-elixir">FFN(x) = max(0, x · W1 + b1) · W2 + b2
</code></pre>
<ul>
<li><p>W1 ∈ ℝ^(d_model × 4*d_model)</p>
</li>
<li><p>W2 ∈ ℝ^(4*d_model × d_model)</p>
</li>
<li><p>b1 ∈ ℝ^(4*d_model), b2 ∈ ℝ^(d_model) (bias vectors)</p>
</li>
</ul>
<p>Key points:</p>
<ul>
<li><p>The hidden layer expands the dimension to <strong>4×d_model</strong>, allowing the network to model more complex relationships.</p>
</li>
<li><p>The final layer projects back to <strong>d_model</strong> to match the residual connection.</p>
</li>
</ul>
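<p>As a sketch, the FFN is two matrix multiplications with a ReLU in between. The weights below are random stand-ins and the sizes are illustrative. The last line checks the "position-wise" property: running a single token on its own gives the same result as running it inside the sequence.</p>
<pre><code class="lang-python">import numpy as np

def ffn(x, W1, b1, W2, b2):
    hidden = np.maximum(0, x @ W1 + b1)   # ReLU, shape (n, 4*d_model)
    return hidden @ W2 + b2               # back to (n, d_model)

rng = np.random.default_rng(0)
n, d_model = 6, 16
d_ff = 4 * d_model
x = rng.normal(size=(n, d_model))
W1, b1 = rng.normal(size=(d_model, d_ff)) * 0.1, np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)) * 0.1, np.zeros(d_model)

y = ffn(x, W1, b1, W2, b2)
print(y.shape)   # (6, 16)
print(np.allclose(ffn(x[:1], W1, b1, W2, b2), y[:1]))   # True: tokens do not interact
</code></pre>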
<hr />
<h3 id="heading-82-role-and-intuition">8.2 Role and Intuition</h3>
<ul>
<li><p><strong>Per-token reasoning</strong>: Each token can combine features in nonlinear ways without affecting other tokens.</p>
</li>
<li><p><strong>Higher-dimensional context</strong>: Expanding the dimension allows the model to create richer transformations and interactions within the token vector.</p>
</li>
<li><p><strong>Complement to attention</strong>: While attention captures <strong>relationships between tokens</strong>, the FFN processes <strong>features within a token</strong>, adding expressivity.</p>
</li>
</ul>
<p>Think of it as giving each token its own “neural mini-network” to refine its meaning after gathering context from attention.</p>
<hr />
<h3 id="heading-83-dimensional-flow">8.3 Dimensional Flow</h3>
<ol>
<li><p>Input to FFN: <code>x</code> → shape n × d_model</p>
</li>
<li><p>First linear layer + ReLU: → n × 4*d_model</p>
</li>
<li><p>Second linear layer: → n × d_model</p>
</li>
<li><p>Because the output shape is again <strong>n × d_model</strong>, the residual connection can be added directly and multiple layers can be stacked.</p>
</li>
</ol>
<h2 id="heading-9-encoder-architecture">9. Encoder Architecture</h2>
<p>The Transformer encoder is a <strong>stack of identical layers</strong>, each designed to process the entire input sequence in parallel while capturing both <strong>token relationships</strong> and <strong>per-token transformations</strong>.</p>
<hr />
<h3 id="heading-91-the-encoder-block">9.1 The Encoder Block</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737654555/f92048e1-3dda-4979-a3dc-856ee091ba70.png" alt class="image--center mx-auto" /></p>
<p>Each encoder layer consists of the following components:</p>
<ol>
<li><p><strong>Multi-Head Self-Attention (MHA)</strong></p>
<ul>
<li><p>Allows each token to attend to every other token in the sequence.</p>
</li>
<li><p>Captures global relationships regardless of distance; attention itself is order-agnostic, so positional information comes from the positional encodings added to the embeddings.</p>
</li>
</ul>
</li>
<li><p><strong>Residual Connection + Layer Normalization</strong></p>
<ul>
<li><p>The input to the attention sublayer is added to its output:</p>
<pre><code class="lang-elixir">  x1 = LayerNorm(x + MHA(x))
</code></pre>
</li>
<li><p>Stabilizes gradients and preserves the original token information.</p>
</li>
</ul>
</li>
<li><p><strong>Feedforward Fully Connected Network (FFN)</strong></p>
<ul>
<li><p>Processes each token independently through two linear layers with ReLU, expanding and compressing dimensions:</p>
<pre><code class="lang-elixir">  x2 = LayerNorm(x1 + FFN(x1))
</code></pre>
</li>
</ul>
</li>
</ol>
<ul>
<li>Each encoder block maintains the input/output shape: <strong>n × d_model</strong>, allowing multiple layers to be stacked without changing dimensionality.</li>
</ul>
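<p>Putting the pieces together, one encoder layer can be sketched as below. To stay short, this version makes some simplifying assumptions: a single attention head stands in for MHA, LayerNorm uses γ = 1 and β = 0, and all weights are random. The helper names and the parameter dictionary are my own.</p>
<pre><code class="lang-python">import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    # gamma = 1 and beta = 0 are assumed to keep the sketch short
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x, p):
    # a single head stands in for multi-head attention here
    Q, K, V = x @ p["W_Q"], x @ p["W_K"], x @ p["W_V"]
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))
    return weights @ V @ p["W_O"]

def ffn(x, p):
    return np.maximum(0, x @ p["W1"] + p["b1"]) @ p["W2"] + p["b2"]

def encoder_layer(x, p):
    x1 = layer_norm(x + self_attention(x, p))   # attention sublayer, then Add and Norm
    x2 = layer_norm(x1 + ffn(x1, p))            # FFN sublayer, then Add and Norm
    return x2

def init_params(d_model, rng, scale=0.1):
    d_ff = 4 * d_model
    shapes = {"W_Q": (d_model, d_model), "W_K": (d_model, d_model),
              "W_V": (d_model, d_model), "W_O": (d_model, d_model),
              "W1": (d_model, d_ff), "W2": (d_ff, d_model)}
    p = {name: rng.normal(size=shape) * scale for name, shape in shapes.items()}
    p["b1"], p["b2"] = np.zeros(d_ff), np.zeros(d_model)
    return p

rng = np.random.default_rng(0)
n, d_model, n_layers = 6, 16, 2   # illustrative sizes
x = rng.normal(size=(n, d_model))
for params in [init_params(d_model, rng) for _ in range(n_layers)]:
    x = encoder_layer(x, params)   # shape is preserved, so layers stack freely
print(x.shape)   # (6, 16)
</code></pre>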
<hr />
<h3 id="heading-92-stacking-layers">9.2 Stacking Layers</h3>
<ul>
<li><p>The Transformer encoder consists of <strong>N identical layers</strong> stacked on top of each other.</p>
</li>
<li><p>Each layer refines the token representations by alternating between:</p>
<ul>
<li><p><strong>Global attention</strong> (multi-head)</p>
</li>
<li><p><strong>Local transformation</strong> (feedforward network)</p>
</li>
</ul>
</li>
<li><p>This combination ensures that after several layers, each token has a <strong>rich, context-aware representation</strong> that incorporates both <strong>relationships to all other tokens</strong> and <strong>complex feature transformations</strong>.</p>
</li>
</ul>
<hr />
<h3 id="heading-93-intuition">9.3 Intuition</h3>
<ul>
<li><p>Think of the encoder as a <strong>deep contextualizer</strong>:</p>
<ul>
<li><p>Multi-head attention gathers relevant information from the sequence.</p>
</li>
<li><p>FFN processes the token’s own features.</p>
</li>
<li><p>LayerNorm + residuals keep the flow stable.</p>
</li>
</ul>
</li>
<li><p>Stacking N layers allows the model to <strong>refine both global and local representations</strong> repeatedly, increasing expressiveness without changing the sequence length or token dimension.</p>
</li>
</ul>
<h2 id="heading-10-decoder-architecture">10. Decoder Architecture</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759737689342/f0ec3aee-9853-4c4b-912f-f2fb2dc3786d.png" alt class="image--center mx-auto" /></p>
<p>The Transformer decoder is responsible for <strong>generating output sequences</strong>, such as translated text. It combines <strong>self-attention</strong>, <strong>cross-attention</strong>, and <strong>feedforward networks</strong>, while respecting the <strong>causal order</strong> of generation.</p>
<hr />
<h3 id="heading-101-masked-multi-head-self-attention">10.1 Masked Multi-Head Self-Attention</h3>
<ul>
<li><p>In the decoder, each token can <strong>only attend to previous tokens</strong> and itself.</p>
</li>
<li><p>This ensures <strong>autoregressive generation</strong>: future tokens are not seen during training or inference.</p>
</li>
<li><p>Implemented by <strong>masking the upper triangle</strong> of the attention score matrix:</p>
</li>
</ul>
<pre><code class="lang-elixir">Scores_masked = Q · Kᵀ / √d_k
Scores_masked[future_positions] = -∞
Weights = softmax(Scores_masked)
Output = Weights · V
</code></pre>
<ul>
<li>The mask prevents information leakage from future tokens, enforcing causality.</li>
</ul>
<p><strong>Note:</strong> this is the manual manipulation of attention scores mentioned earlier in the self-attention section.</p>
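<p>The masking pseudocode can be sketched in NumPy as follows. <code>np.triu</code> with <code>k=1</code> marks everything strictly above the diagonal, which is exactly the set of future positions. Sizes and inputs are illustrative.</p>
<pre><code class="lang-python">import numpy as np

def causal_attention_weights(Q, K):
    n, d_k = Q.shape
    scores = Q @ K.T / np.sqrt(d_k)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)   # True strictly above the diagonal
    scores = np.where(future, -np.inf, scores)           # future positions get -inf
    scores = scores - scores.max(axis=-1, keepdims=True) # row max is finite (the diagonal is kept)
    e = np.exp(scores)                                   # exp(-inf) = 0
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n, d_k = 4, 8
Q, K = rng.normal(size=(n, d_k)), rng.normal(size=(n, d_k))
W = causal_attention_weights(Q, K)
print(np.round(W, 2))   # lower-triangular: row t is zero beyond column t
</code></pre>
<p>The first token can only attend to itself, so its row is always <code>[1, 0, 0, 0]</code>.</p>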
<hr />
<h3 id="heading-102-cross-attention-with-encoder-outputs">10.2 Cross-Attention with Encoder Outputs</h3>
<ul>
<li><p>After masked self-attention, the decoder performs <strong>cross-attention</strong>:</p>
<ul>
<li><p>Queries (Q) come from the decoder’s previous layer outputs</p>
</li>
<li><p>Keys (K) and Values (V) come from the encoder’s final outputs</p>
</li>
</ul>
</li>
</ul>
<pre><code class="lang-elixir">CrossAttention(Q_dec, K_enc, V_enc)
</code></pre>
<ul>
<li><p>This allows the decoder to <strong>condition its generation</strong> on the input sequence.</p>
</li>
<li><p>Intuitively, the decoder “looks at” the encoder’s representation to decide which information is relevant for generating the next token.</p>
</li>
</ul>
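<p>A shape-level sketch of cross-attention, using a single head and random weights for brevity. The source has <code>n</code> tokens and the decoder has produced <code>m</code> tokens so far (both names and sizes are illustrative), so the attention matrix is rectangular: <code>m × n</code>.</p>
<pre><code class="lang-python">import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(dec_states, enc_outputs, W_Q, W_K, W_V):
    Q = dec_states @ W_Q      # (m, d_k): queries come from the decoder
    K = enc_outputs @ W_K     # (n, d_k): keys come from the encoder
    V = enc_outputs @ W_V     # (n, d_k): values come from the encoder
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # (m, n)
    return weights @ V, weights

rng = np.random.default_rng(0)
n, m, d_model, d_k = 6, 3, 16, 4
enc_outputs = rng.normal(size=(n, d_model))
dec_states = rng.normal(size=(m, d_model))
W_Q, W_K, W_V = (rng.normal(size=(d_model, d_k)) for _ in range(3))

out, weights = cross_attention(dec_states, enc_outputs, W_Q, W_K, W_V)
print(out.shape, weights.shape)   # (3, 4) (3, 6)
</code></pre>
<p>No causal mask is needed here: the whole source sequence is known, so every decoder position may look at every encoder position.</p>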
<hr />
<h3 id="heading-103-feedforward-network-and-residuals">10.3 Feedforward Network and Residuals</h3>
<ul>
<li>Similar to the encoder, each decoder block contains a <strong>position-wise FFN</strong> with ReLU:</li>
</ul>
<pre><code class="lang-elixir">Output = LayerNorm(Input + FFN(Input))
</code></pre>
<ul>
<li>Residual connections and layer normalization stabilize training and maintain the token dimension <code>d_model</code>.</li>
</ul>
<hr />
<h3 id="heading-104-overall-decoder-block-flow">10.4 Overall Decoder Block Flow</h3>
<ol>
<li><p><strong>Masked Multi-Head Self-Attention</strong> → Add &amp; Norm</p>
</li>
<li><p><strong>Cross Multi-Head Attention</strong> (with encoder outputs) → Add &amp; Norm</p>
</li>
<li><p><strong>Feedforward Network</strong> → Add &amp; Norm</p>
</li>
</ol>
<ul>
<li><p>Each decoder layer maintains the input/output shape: <strong>n × d_model</strong>, allowing stacking of N layers.</p>
</li>
<li><p>The decoder can now generate sequences <strong>autoregressively</strong>, using attention to both past outputs and the encoder’s representation.</p>
</li>
</ul>
<hr />
<h3 id="heading-105-intuition">10.5 Intuition</h3>
<ul>
<li><p>Masked self-attention ensures <strong>future tokens do not influence current predictions</strong></p>
</li>
<li><p>Cross-attention allows the model to <strong>condition on the input sequence</strong></p>
</li>
<li><p>Feedforward networks provide <strong>local per-token reasoning</strong>, just like in the encoder</p>
</li>
<li><p>Together, these components allow the decoder to generate fluent, contextually correct sequences <strong>one token at a time</strong></p>
</li>
</ul>
<h2 id="heading-11-training-vs-inference">11. Training vs Inference</h2>
<p>Transformers behave differently during <strong>training</strong> and <strong>inference</strong>, and understanding this distinction is key to grasping how they generate sequences efficiently.</p>
<hr />
<h3 id="heading-111-training">11.1 Training</h3>
<ul>
<li><p>During training, the <strong>entire target sequence is available</strong> at once.</p>
</li>
<li><p>Masking ensures <strong>causal behavior</strong>: each token can only attend to previous tokens, preventing information leakage from the future.</p>
</li>
<li><p>The main advantages of training in parallel:</p>
<ul>
<li><p><strong>Fully parallelizable</strong>: all tokens in the sequence are processed simultaneously, leveraging GPU acceleration</p>
</li>
<li><p><strong>Stable gradients</strong>: any two positions are connected by a single attention step, so information from distant tokens does not fade through a long recurrent chain as it does in RNNs</p>
</li>
<li><p><strong>Faster convergence</strong>: every position produces a prediction and a training signal in one forward pass</p>
</li>
</ul>
</li>
<li><p>Loss is computed for all tokens simultaneously, usually using <strong>cross-entropy</strong> between predicted and actual next-token distributions.</p>
</li>
</ul>
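<p>The training setup above can be sketched in a few lines. This is a toy illustration under assumed names (<code>causal_mask</code>, <code>shift_for_teacher_forcing</code>, a 10-token vocabulary), not a full training loop.</p>
<pre><code class="lang-python">import math

def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j no later than i)."""
    return [[i >= j for j in range(n)] for i in range(n)]

def shift_for_teacher_forcing(target_ids, bos_id):
    """Decoder input is the target shifted right; the label at each position is the next token."""
    return [bos_id] + target_ids[:-1], target_ids

def cross_entropy(probs_per_position, labels):
    """Mean negative log-probability assigned to the correct next token."""
    losses = [-math.log(p[y]) for p, y in zip(probs_per_position, labels)]
    return sum(losses) / len(losses)

mask = causal_mask(3)
for row in mask:
    print(["x" if allowed else "." for allowed in row])
# ['x', '.', '.']
# ['x', 'x', '.']
# ['x', 'x', 'x']

decoder_input, labels = shift_for_teacher_forcing([7, 8, 9], bos_id=0)
print(decoder_input, labels)  # [0, 7, 8] [7, 8, 9]

# An untrained model that spreads probability evenly over the assumed 10-token vocabulary.
uniform = [[0.1] * 10 for _ in labels]
loss = cross_entropy(uniform, labels)
print(round(loss, 4))  # 2.3026, i.e. ln(10)
</code></pre>
<p>All three positions are scored in the same pass; the mask is the only thing stopping position 0 from peeking at tokens 8 and 9.</p>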
<hr />
<h3 id="heading-112-inference">11.2 Inference</h3>
<ul>
<li><p>During inference, sequences are generated <strong>token by token</strong> (autoregressively).</p>
</li>
<li><p>For each new token:</p>
<ol>
<li><p>The decoder attends to <strong>all previously generated tokens</strong> using masked self-attention</p>
</li>
<li><p>The decoder attends to <strong>encoder outputs</strong> via cross-attention</p>
</li>
<li><p>The next token is predicted based on the output distribution</p>
</li>
</ol>
</li>
<li><p>This process repeats until an <strong>end-of-sequence token</strong> is produced.</p>
</li>
<li><p>Key point: <strong>generation is sequential</strong>, but the underlying attention mechanism still allows each token to consider <strong>all past context efficiently</strong>.</p>
</li>
</ul>
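<p>The loop above looks roughly like this with greedy decoding. The <code>next_token_scores</code> callable is a hypothetical stand-in for the whole decoder stack, and <code>toy_model</code> is a fake model that just copies its input, so treat this as a sketch of the control flow only.</p>
<pre><code class="lang-python">def greedy_decode(next_token_scores, encoder_output, bos_id, eos_id, max_len=20):
    """Generate token ids one at a time until EOS or max_len.

    next_token_scores is a hypothetical callable: given the encoder output and the
    tokens generated so far, it returns one score per vocabulary id for the next position.
    """
    generated = [bos_id]
    for _ in range(max_len):
        scores = next_token_scores(encoder_output, generated)
        next_id = max(range(len(scores)), key=lambda i: scores[i])  # greedy pick
        generated.append(next_id)
        if next_id == eos_id:
            break
    return generated[1:]

# Toy stand-in model (assumed): copies the source ids in order, then emits EOS (id 0).
def toy_model(encoder_output, generated):
    step = len(generated) - 1                  # tokens produced so far, excluding BOS
    remaining = encoder_output[step:]          # empty once the source is exhausted
    wanted = remaining[0] if remaining else 0  # 0 is the EOS id in this toy setup
    return [1.0 if i == wanted else 0.0 for i in range(5)]

result = greedy_decode(toy_model, encoder_output=[3, 1, 4], bos_id=2, eos_id=0)
print(result)  # [3, 1, 4, 0]
</code></pre>
<p>Note that the encoder output is computed once and reused at every step; only the list of generated tokens grows.</p>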
<hr />
<h3 id="heading-113-intuition">11.3 Intuition</h3>
<ul>
<li><p><strong>Training</strong>: “See everything at once, learn relationships in parallel.”</p>
</li>
<li><p><strong>Inference</strong>: “Predict one token at a time, using previous context.”</p>
</li>
</ul>
<p>This separation explains why Transformers can <strong>train extremely fast</strong> compared to RNNs while still generating sequences <strong>autoregressively</strong> when needed.</p>
<h2 id="heading-12-key-insights-amp-closing-thoughts">12. Key Insights &amp; Closing Thoughts</h2>
<p>The Transformer architecture, introduced in <em>“Attention Is All You Need”</em>, represents a paradigm shift in sequence modeling. Here are the core takeaways:</p>
<hr />
<h3 id="heading-121-key-insights">12.1 Key Insights</h3>
<ul>
<li><p><strong>No Recurrence, No Convolution</strong>: Unlike RNNs or CNNs, Transformers rely entirely on attention to model relationships between tokens.</p>
</li>
<li><p><strong>Global Context via Self-Attention</strong>: Each token can attend to all others in the sequence, enabling long-range dependencies in a single step.</p>
</li>
<li><p><strong>Parallel Training</strong>: Training is fully parallelizable, solving the sequential bottleneck of RNNs.</p>
</li>
<li><p><strong>Separation of Concerns</strong>:</p>
<ul>
<li><p><strong>Attention</strong> handles global, cross-token context</p>
</li>
<li><p><strong>Feedforward networks</strong> handle per-token transformations and feature reasoning</p>
</li>
</ul>
</li>
<li><p><strong>LayerNorm + Residuals</strong> stabilize deep architectures, allowing many stacked layers without vanishing gradients.</p>
</li>
<li><p><strong>Masked Decoding</strong>: The causal mask lets the model train on every position in parallel while keeping each prediction dependent only on earlier tokens, which is the same condition it faces during autoregressive inference.</p>
</li>
</ul>
<hr />
<h3 id="heading-122-closing-thoughts">12.2 Closing Thoughts</h3>
<ul>
<li><p>Transformers have reshaped NLP and AI by providing a <strong>scalable, interpretable, and highly expressive architecture</strong>.</p>
</li>
<li><p>The same attention mechanisms extend beyond text: <strong>Vision Transformers, audio modeling, and even diffusion models</strong> use similar principles.</p>
</li>
<li><p>Intuitive takeaway: <strong>“Attention is the language of relationships.”</strong> Each token communicates with others, forming a rich, context-aware understanding of the sequence.</p>
</li>
</ul>
<hr />
<p>All sections have been put together using this video.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=bCz4OMemCcA&amp;t=2500s">https://www.youtube.com/watch?v=bCz4OMemCcA&amp;t=2500s</a></div>
<p>If you still have any queries, you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Building for the Worst Case: The Google File System]]></title><description><![CDATA[Introduction
What happens when you need to store the entire web? That’s the kind of problem Google faced in the early 2000s, and the solution they came up with was the Google File System (GFS).
Today I read through the GFS paper — my first real syste...]]></description><link>https://highonbugs.sbk2k1.in/google-file-system</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/google-file-system</guid><category><![CDATA[Google]]></category><category><![CDATA[System Design]]></category><category><![CDATA[paper]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 31 Aug 2025 14:15:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756649688201/177b7149-cf31-45dc-be56-c598010b0a34.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>What happens when you need to store the entire web? That’s the kind of problem Google faced in the early 2000s, and the solution they came up with was the <strong>Google File System (GFS)</strong>.</p>
<p>Today I read through the GFS paper — my first real system design paper. I didn’t approach it as an academic exercise, but as a curious software engineer trying to understand how systems this big actually work. I used to think about software mostly in terms of APIs, libraries, and backend services. But this paper opened a different lens: how do you build something that works at the scale of an entire company like Google?</p>
<p>It was a mix of overwhelming and fascinating. What stood out is that GFS isn’t about fancy algorithms or textbook elegance. It’s about <strong>looking at messy realities — drives failing constantly, files being massive, writes being mostly appends — and making design choices that fit that reality</strong>.</p>
<p>That’s the lens I want to capture in this blog: not just what GFS is, but <strong>why those choices were made, what tradeoffs they carried, and how correctly judging needs is the most important step in designing large systems.</strong></p>
<h2 id="heading-setting-the-stage-observations-amp-assumptions">Setting the Stage: Observations &amp; Assumptions</h2>
<p>Before diving into design, the GFS paper starts with something that feels almost too simple: <strong>observing reality.</strong> Google looked at how their systems were actually being used, and that shaped everything.</p>
<p>Some of the key things they noticed:</p>
<ul>
<li><p>Drives fail all the time. So the system must expect failures as normal, not rare.</p>
</li>
<li><p>Files are <em>huge</em> (multi-gigabyte scale), so small optimizations for tiny files don’t matter much.</p>
</li>
<li><p>Reads are the most common operation. Writes happen too, but usually as appends — random overwrites are rare.</p>
</li>
</ul>
<p>From these observations came their <strong>assumptions</strong>:</p>
<ul>
<li><p>Use inexpensive drives instead of high-end, reliable ones.</p>
</li>
<li><p>Expect a modest number of very large files rather than billions of tiny ones.</p>
</li>
<li><p>Prioritize bandwidth and availability over raw latency.</p>
</li>
<li><p>Design for multiple clients writing at once, so the consistency semantics must be defined.</p>
</li>
</ul>
<p>What I liked here is how straightforward this feels: instead of assuming “the perfect file system,” they started by asking <em>what actually matters for us.</em> That mindset feels like a huge lesson in itself.</p>
<h2 id="heading-core-design-masterchunkserver-model">Core Design: Master–Chunkserver Model</h2>
<p>Once the assumptions were set, GFS introduced its core idea: <strong>split the world into one master and many chunkservers.</strong></p>
<ul>
<li><p><strong>The master keeps all the metadata</strong>: which files exist, how they’re split into chunks, and where those chunks live.</p>
</li>
<li><p>The <strong>chunkservers</strong> store the actual file data in large 64 MB chunks.</p>
</li>
<li><p>Clients talk to the master only to figure out <em>where</em> data lives, and then they go directly to the chunkservers to read or write.</p>
</li>
</ul>
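<p>Because chunks have a fixed size, the client can work out which chunk it needs before it ever talks to the master. A small sketch, assuming the paper’s 64 MB chunk size (the function names are mine):</p>
<pre><code class="lang-python">CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as in the paper

def locate(offset):
    """Translate a byte offset within a file into (chunk index, offset inside that chunk)."""
    return divmod(offset, CHUNK_SIZE)

def chunks_touched(offset, length):
    """Chunk indexes that a read of `length` bytes starting at `offset` has to visit."""
    first, _ = locate(offset)
    last, _ = locate(offset + length - 1)
    return list(range(first, last + 1))

print(locate(200 * 1024 * 1024))                           # (3, 8388608)
print(chunks_touched(60 * 1024 * 1024, 10 * 1024 * 1024))  # [0, 1]
</code></pre>
<p>The client sends the file name and chunk index to the master, gets back the chunk handle and replica locations, and from then on talks to chunkservers directly.</p>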
<p>This setup has some nice effects:</p>
<ul>
<li><p>Because chunks are so large, there’s less metadata to keep track of, which means fewer lookups and less chatter across the network.</p>
</li>
<li><p>Keeping metadata in memory makes the master fast.</p>
</li>
<li><p>Neither clients nor chunkservers cache file data, so there are no cache coherence headaches.</p>
</li>
</ul>
<p>But there are clear tradeoffs too:</p>
<ul>
<li><p>The master is a <strong>single point of failure</strong>, even though it logs operations, checkpoints the state, and has replicas.</p>
</li>
<li><p>Large chunk sizes can cause internal fragmentation.</p>
</li>
<li><p>Small files that sit in a single chunk can create <strong>hotspots</strong> if too many clients hammer the same chunkserver.</p>
</li>
</ul>
<p>Centralizing metadata in a master makes the system simpler to reason about, even if it brings its own risks.</p>
<p>The master is the brain of GFS, and one of its most interesting jobs is handling the <strong>namespace</strong> — all the directories and files. Instead of a traditional per-directory data structure, GFS uses a flat <strong>in-memory mapping of full pathnames to metadata</strong>, with <strong>prefix compression</strong> to save space.</p>
<p>Locks ensure consistency during namespace operations:</p>
<ul>
<li><p><strong>Read lock:</strong> prevents a directory from being deleted, renamed, or snapshotted while files inside are being accessed.</p>
</li>
<li><p><strong>Write lock:</strong> applied when modifying a specific file.</p>
</li>
<li><p>Example: if <code>/home/user</code> is being snapshotted to <code>/save/user</code>, then <code>/home/user/foo</code> cannot be created at the same time.</p>
</li>
</ul>
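<p>The locking rule in the paper is: take read locks on every ancestor directory and a read or write lock on the full pathname itself. Here is a rough sketch of why the snapshot example conflicts; the helper names are hypothetical and real lock acquisition and ordering are left out.</p>
<pre><code class="lang-python">def locks_needed(path, leaf_mode):
    """Read locks on every ancestor directory, plus a read or write lock on the full path."""
    parts = path.strip("/").split("/")
    ancestors = ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
    locks = {p: "read" for p in ancestors}
    locks[path] = leaf_mode
    return locks

def conflicts(locks_a, locks_b):
    """Two operations conflict if they share a path and at least one side wants a write lock."""
    shared = set(locks_a).intersection(locks_b)
    return sorted(p for p in shared if "write" in (locks_a[p], locks_b[p]))

# Snapshot of /home/user to /save/user: write locks on both the source and destination paths.
snapshot = {**locks_needed("/home/user", "write"), **locks_needed("/save/user", "write")}
create_foo = locks_needed("/home/user/foo", "write")
create_bar = locks_needed("/home/user/bar", "write")

print(conflicts(snapshot, create_foo))    # ['/home/user']
print(conflicts(create_foo, create_bar))  # []
</code></pre>
<p>The second result is the nice part: two creates in the same directory only need read locks on <code>/home/user</code>, so they can run at the same time.</p>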
<p>This design keeps operations simple and fast, while still allowing multiple files in the same directory to be updated concurrently.</p>
<h2 id="heading-consistency-amp-mutations">Consistency &amp; Mutations</h2>
<p>One of the hardest parts of any distributed file system is <strong>consistency</strong>: how do you make sure all clients see the same data, even when multiple replicas and clients are involved? GFS doesn’t aim for strict textbook consistency. Instead, it defines a model that works for its workloads.</p>
<p>Two key terms the paper defines:</p>
<ul>
<li><p><strong>Consistent:</strong> all clients see the same data across replicas.</p>
</li>
<li><p><strong>Defined:</strong> the region is consistent, and clients also see what a mutation wrote in its entirety, with no fragments of other writes mixed in.</p>
</li>
</ul>
<h3 id="heading-how-gfs-enforces-this">How GFS enforces this</h3>
<ul>
<li><p><strong>Mutation ordering:</strong> whenever data is written or appended, the same sequence of mutations is applied across all replicas.</p>
</li>
<li><p><strong>Versioning:</strong> each chunk has a version number. If a replica misses mutations while its chunkserver is down, the master spots the outdated version and garbage-collects that replica.</p>
</li>
<li><p><strong>Recovery from corruption:</strong> if a replica fails checksum verification, the master triggers a clone from another healthy replica.</p>
</li>
</ul>
<h4 id="heading-example-consistent-but-undefined">Example: “Consistent but Undefined”</h4>
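<p>Imagine two clients writing to the same region of a chunk at the same time, with each write large enough to be split into several pieces:</p>
<ul>
<li><p>The primary picks one order for the pieces, and every replica applies them in that order.</p>
</li>
<li><p>Both writes succeed, but the region can end up holding <strong>mingled fragments of both writes</strong> rather than either client’s data in full.</p>
</li>
<li><p>All replicas hold the same bytes (so it’s <em>consistent</em>), but those bytes may not match what any single client wrote (so it’s <em>undefined</em>).</p>
</li>
</ul>
<p>Concurrent record appends behave differently. If three clients append log entries A, B, and C and the append of B fails part-way, the client retries, and the file can end up with padding or a duplicate record, for example <strong>A, B, B, C</strong>. Each record that succeeds sits in a defined region, while the padding and partial copies in between may differ across replicas, which the paper calls <em>inconsistent</em>.</p>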
<h3 id="heading-handling-writes-vs-record-appends">Handling Writes vs. Record Appends</h3>
<ul>
<li><p><strong>Normal writes</strong> go to an offset the client chooses. Large writes are split into several operations, and each one is applied in the order the primary assigns.</p>
</li>
<li><p><strong>Record appends</strong> (which are very common in Google’s workloads) are trickier because multiple clients may append simultaneously:</p>
<ul>
<li><p>The primary replica decides the order.</p>
</li>
<li><p>If an append would overflow a chunk, the system pads the chunk and continues on the next one.</p>
</li>
<li><p>Each append is atomic: the record is written at least once as one contiguous unit, at an offset GFS chooses.</p>
</li>
<li><p>The tradeoff: duplicates or extra padding may appear, but applications can clean those up later.</p>
</li>
</ul>
</li>
</ul>
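<p>The paper suggests how applications cope with this: writers attach a checksum and a unique identifier to each record, and readers use them to skip padding and drop duplicates. A minimal sketch, assuming a chunk is modelled as a Python list where <code>None</code> stands for padding (the record layout here is made up for illustration):</p>
<pre><code class="lang-python">import zlib

def make_record(record_id, payload):
    """Writer side: attach a unique id and a checksum to each record (payload is bytes)."""
    return {"id": record_id, "payload": payload, "crc": zlib.crc32(payload)}

def read_valid_records(raw_records):
    """Reader side: skip padding and corrupt fragments, then drop duplicate ids."""
    seen = set()
    for rec in raw_records:
        if rec is None:                               # padding region
            continue
        if zlib.crc32(rec["payload"]) != rec["crc"]:  # torn or corrupt fragment
            continue
        if rec["id"] in seen:                         # duplicate left by a retried append
            continue
        seen.add(rec["id"])
        yield rec["payload"]

a, b, c = make_record("a", b"A"), make_record("b", b"B"), make_record("c", b"C")
chunk = [a, None, b, b, c]  # padding plus a duplicate B from a retry
records = list(read_valid_records(chunk))
print(records)  # [b'A', b'B', b'C']
</code></pre>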
<h3 id="heading-why-this-matters">Why this matters</h3>
<p>This model may sound a bit “loose” compared to strong consistency, but it matches Google’s needs. Most workloads are append-heavy (like logs), so atomic appends are far more important than perfect overwrite semantics. By relaxing guarantees, GFS achieves simpler and faster recovery.</p>
<p>For me, the big takeaway is that <strong>consistency isn’t one-size-fits-all.</strong> GFS shows how systems can define their own version of “good enough” consistency that aligns with real workloads — in this case, reliable appends and quick recovery over strict guarantees.</p>
<h2 id="heading-system-interaction-flow">System Interaction Flow</h2>
<p>Once you understand the master–chunkserver split, the next question is: <em>how do clients actually read and write data in this setup?</em> GFS uses a mix of leases, pipelined data flow, and acknowledgments to keep things orderly.</p>
<h3 id="heading-lease-mechanism">Lease Mechanism</h3>
<ul>
<li><p>For each chunk, the <strong>master grants a lease</strong> to one replica (called the <em>primary</em>).</p>
</li>
<li><p>The lease starts with a <strong>60-second</strong> timeout, but the primary can keep requesting extensions, piggybacked on heartbeat messages, for as long as the chunk is being mutated.</p>
</li>
<li><p>The primary decides the <strong>order of mutations</strong> for that chunk, while secondaries simply follow along.</p>
</li>
<li><p>If needed, the master can revoke a lease and reassign it elsewhere.</p>
</li>
</ul>
<p>This approach avoids the chaos of multiple replicas competing to decide mutation order.</p>
<h3 id="heading-write-flow">Write Flow</h3>
<p>Here’s how a write actually happens:</p>
<ol>
<li><p>The client asks the master which chunkserver holds the lease for the chunk and where the other replicas live. If no lease exists, the master grants one to a replica it chooses, which becomes the primary.</p>
</li>
<li><p>The client <strong>pushes data to all replicas</strong> in a pipelined fashion. Each chunkserver stores it in a buffer.</p>
</li>
<li><p>Once all replicas have acknowledged receiving the data, the client sends a <strong>write request to the primary</strong>.</p>
</li>
<li><p>The primary assigns a mutation order and forwards the request to the secondaries.</p>
</li>
<li><p>The secondaries apply the change and acknowledge back to the primary.</p>
</li>
<li><p>The primary finally replies to the client: success or error.</p>
</li>
</ol>
<p>Large writes that cross chunk boundaries are split into multiple write operations, but the flow for each is the same.</p>
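<p>The key idea in the write flow is that data and control travel separately: data is buffered everywhere first, and only then does the primary decide the order. A toy simulation of that, with hypothetical <code>Replica</code> and <code>Primary</code> classes and no error handling:</p>
<pre><code class="lang-python">class Replica:
    def __init__(self):
        self.buffer = {}   # data pushed by clients, keyed by a data id, not yet applied
        self.applied = []  # mutations applied to the chunk, in serial order

    def push(self, data_id, data):
        self.buffer[data_id] = data  # step 2: data flow

    def apply(self, serial, data_id):
        self.applied.append((serial, self.buffer.pop(data_id)))

class Primary(Replica):
    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries
        self.next_serial = 0

    def write(self, data_id):
        serial = self.next_serial    # step 4: the primary picks the order
        self.next_serial += 1
        self.apply(serial, data_id)
        for s in self.secondaries:   # steps 4-5: forward, secondaries apply in the same order
            s.apply(serial, data_id)
        return "success"             # step 6: reply to the client

secondaries = [Replica(), Replica()]
primary = Primary(secondaries)
for data_id, data in [("x", b"from client 1"), ("y", b"from client 2")]:
    for replica in [primary] + secondaries:
        replica.push(data_id, data)

primary.write("y")  # client 2's write request happens to reach the primary first
primary.write("x")
print(all(s.applied == primary.applied for s in secondaries))  # True
</code></pre>
<p>Whichever request reaches the primary first wins serial number 0, and every replica ends up with the same sequence.</p>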
<h3 id="heading-data-flow-optimization">Data Flow Optimization</h3>
<p>Instead of broadcasting in a tree-like pattern, GFS pipelines data linearly through replicas (like a chain). This minimizes network bottlenecks: each chunkserver forwards data only to the “closest” replica that has not received it yet, and starts forwarding as soon as bytes arrive, so every machine’s full outbound bandwidth goes to a single transfer.</p>
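<p>The paper gives a simple formula for the ideal pipelined transfer time: B/T + R·L, where B is the number of bytes, T the network throughput, R the number of replicas, and L the per-hop latency. A quick back-of-the-envelope check, where the 1 ms latency is my assumed upper bound and the store-and-forward comparison is a simplification of my own:</p>
<pre><code class="lang-python">def pipelined_seconds(num_bytes, replicas, throughput_bytes_per_s, hop_latency_s):
    """Ideal pipelined time from the paper: B/T + R*L."""
    return num_bytes / throughput_bytes_per_s + replicas * hop_latency_s

def store_and_forward_seconds(num_bytes, replicas, throughput_bytes_per_s):
    """Simplified comparison: each replica receives the whole payload before passing it on."""
    return replicas * num_bytes / throughput_bytes_per_s

one_mb = 1_000_000
link = 12.5e6  # 100 Mbps expressed in bytes per second
pipelined = pipelined_seconds(one_mb, replicas=3, throughput_bytes_per_s=link, hop_latency_s=0.001)
naive = store_and_forward_seconds(one_mb, replicas=3, throughput_bytes_per_s=link)
print(round(pipelined * 1000), "ms")  # 83 ms
print(round(naive * 1000), "ms")      # 240 ms
</code></pre>
<p>That lines up with the paper’s own estimate of roughly 80 ms to distribute 1 MB on 100 Mbps links.</p>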
<h3 id="heading-atomic-record-appends">Atomic Record Appends</h3>
<p>Appends are slightly different:</p>
<ul>
<li><p>The client sends data to all replicas of the <em>last chunk of the file</em>.</p>
</li>
<li><p>The primary checks whether appending would overflow the chunk. If yes, it pads the chunk and moves the append to the next one.</p>
</li>
<li><p>To avoid worst-case fragmentation, each record append is capped at 1/4 of the maximum chunk size.</p>
</li>
</ul>
<p>The result is atomicity: the record either appears fully or not at all, even if duplicates or padding slip in.</p>
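<p>The primary’s decision for each append boils down to one comparison. A sketch under the paper’s 64 MB chunk size, with a hypothetical <code>primary_append</code> function and made-up return labels:</p>
<pre><code class="lang-python">CHUNK_SIZE = 64 * 1024 * 1024
MAX_APPEND = CHUNK_SIZE // 4  # appends are capped at a quarter of the chunk size

def primary_append(used_bytes, record_len):
    """Decide what the primary does with an append to a chunk that already holds used_bytes."""
    if record_len > MAX_APPEND:
        raise ValueError("record exceeds the per-append cap")
    if used_bytes + record_len > CHUNK_SIZE:
        # Not enough room: pad out the chunk and have the client retry on the next one.
        return ("pad_and_retry_on_next_chunk", CHUNK_SIZE - used_bytes)
    # Enough room: the record lands at the current end of the chunk on every replica.
    return ("append_at", used_bytes)

print(primary_append(10 * 1024 * 1024, 1024))  # ('append_at', 10485760)
print(primary_append(CHUNK_SIZE - 100, 1024))  # ('pad_and_retry_on_next_chunk', 100)
</code></pre>
<p>Since a record can never be larger than <code>MAX_APPEND</code>, the padding wasted on any one chunk stays below a quarter of the chunk.</p>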
<h3 id="heading-my-takeaway">My Takeaway</h3>
<p>I really liked how <strong>the flow prioritizes order and throughput</strong> rather than chasing latency. By making the primary responsible for sequencing and keeping data movement linear, GFS avoids a ton of potential complexity. It shows how clever data flow design is just as important as storage structure.</p>
<h2 id="heading-features-that-support-scale">Features That Support Scale</h2>
<p>Once the basics of storing and writing chunks were solved, GFS added features that made the system easier to manage and more reliable as it grew. These features aren’t flashy, but they’re the backbone of why GFS worked at scale.</p>
<h3 id="heading-snapshots">Snapshots</h3>
<ul>
<li><p>GFS can create a snapshot of a file or directory tree almost instantly.</p>
</li>
<li><p>Instead of copying everything, it uses <strong>copy-on-write (CoW):</strong></p>
<ul>
<li><p>The master logs the snapshot operation.</p>
</li>
<li><p>Metadata is copied, but the actual chunk data isn’t duplicated until a write happens.</p>
</li>
<li><p>If a chunk is written to later, the system creates a new local copy.</p>
</li>
</ul>
</li>
<li><p>Because chunks are only copied when needed (and locally, not over the network), snapshots are cheap and fast.</p>
</li>
</ul>
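<p>One way to picture copy-on-write is with per-chunk reference counts, which is how the paper describes it. The <code>Master</code> class below is an illustrative toy: it only tracks metadata, and the actual chunk copy on the chunkservers is reduced to a comment.</p>
<pre><code class="lang-python">import itertools

class Master:
    """Tracks file-to-chunk mappings and per-chunk reference counts (names are illustrative)."""
    def __init__(self):
        self.files = {}     # path to list of chunk handles
        self.refcount = {}  # chunk handle to number of files pointing at it
        self._handles = itertools.count(1)

    def create(self, path, num_chunks):
        handles = [next(self._handles) for _ in range(num_chunks)]
        self.files[path] = handles
        for h in handles:
            self.refcount[h] = 1

    def snapshot(self, src, dst):
        # Copy metadata only: both files now point at the same chunks.
        self.files[dst] = list(self.files[src])
        for h in self.files[dst]:
            self.refcount[h] += 1

    def prepare_write(self, path, index):
        # The first write to a shared chunk triggers the copy; later writes go straight through.
        old = self.files[path][index]
        if self.refcount[old] > 1:
            new = next(self._handles)
            self.refcount[old] -= 1
            self.refcount[new] = 1
            self.files[path][index] = new  # chunkservers would copy the old chunk locally here
        return self.files[path][index]

m = Master()
m.create("/home/user/log", num_chunks=2)       # handles 1 and 2
m.snapshot("/home/user/log", "/save/user/log")
print(m.prepare_write("/home/user/log", 0))    # 3 (fresh copy of chunk 1)
print(m.files["/save/user/log"])               # [1, 2] (the snapshot still sees the old chunks)
</code></pre>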
<h3 id="heading-replica-placement">Replica Placement</h3>
<p>Reliability and availability depend heavily on <strong>where replicas live</strong>:</p>
<ul>
<li><p>New replicas are placed on chunkservers with below-average disk usage.</p>
</li>
<li><p>The system avoids putting too many new replicas on the same server.</p>
</li>
<li><p>Replicas are spread across racks for fault tolerance.</p>
</li>
<li><p>Re-replication kicks in when chunks fall below the target replication level (default = 3).</p>
</li>
<li><p>Load balancing happens by migrating chunks around.</p>
</li>
</ul>
<p>The master continuously balances between preventing hotspots and maximizing network bandwidth.</p>
<h3 id="heading-garbage-collection">Garbage Collection</h3>
<p>Deleting files isn’t immediate in GFS:</p>
<ul>
<li><p>A deletion is logged immediately, but the file is only renamed to a hidden name that carries the deletion timestamp.</p>
</li>
<li><p>The actual chunks are removed lazily once the master sees they are no longer referenced.</p>
</li>
<li><p>Heartbeats between chunkservers and the master help synchronize which replicas can be cleaned up.</p>
</li>
<li><p>The delay allows for recovery if something was deleted by mistake.</p>
</li>
</ul>
<p>It’s a <strong>uniform and dependable cleanup mechanism</strong>, but the tradeoff is that deleted space isn’t reclaimed instantly.</p>
<h3 id="heading-stale-replica-deletion">Stale Replica Deletion</h3>
<ul>
<li><p>Each replica has a version number.</p>
</li>
<li><p>When a new lease is issued, the version number increments.</p>
</li>
<li><p>Any replica with an older version is garbage-collected as stale.</p>
</li>
<li><p>The master decides correctness by looking at the highest version it knows.</p>
</li>
</ul>
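<p>A small sketch of the version bookkeeping, assuming three chunkservers with made-up names and one of them offline when the lease is granted:</p>
<pre><code class="lang-python">def grant_lease(master_version, replica_versions, live_replicas):
    """On a new lease the master bumps the chunk version and tells the up-to-date live replicas."""
    new_version = master_version + 1
    for name in live_replicas:
        if replica_versions[name] == master_version:
            replica_versions[name] = new_version
    return new_version

def stale_replicas(master_version, replica_versions):
    """Replicas reporting a version older than the master's are stale and can be garbage-collected."""
    return sorted(name for name, v in replica_versions.items() if master_version > v)

versions = {"cs1": 4, "cs2": 4, "cs3": 4}
master_version = grant_lease(4, versions, live_replicas=["cs1", "cs2"])  # cs3 is down
print(master_version, stale_replicas(master_version, versions))  # 5 ['cs3']
</code></pre>
<p>When <code>cs3</code> comes back and reports version 4, the master knows it missed at least one round of mutations without comparing any data.</p>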
<p>Scaling here isn’t just about handling bigger workloads — it’s about making the system self-maintaining over time.</p>
<h3 id="heading-fault-tolerance-amp-diagnosis">Fault Tolerance &amp; Diagnosis</h3>
<p>GFS was designed with the assumption that <strong>failures are the norm, not the exception.</strong> Disks fail, nodes disappear, networks hiccup — so the system builds resilience directly into its core.</p>
<ul>
<li><p><strong>Chunk Replication:</strong> Every chunk is stored across multiple chunkservers (default: 3 replicas). This ensures that even if one server or disk dies, the data isn’t lost. Replication also helps balance read loads across the cluster.</p>
</li>
<li><p><strong>Master Replication:</strong> The master’s metadata is too important to risk, so its operation log and periodic checkpoints are replicated on multiple machines. After a crash, a new master process can load the latest checkpoint, replay the log, and take over quickly.</p>
</li>
<li><p><strong>Stale Replica Detection:</strong> Each chunk carries a version number. When the master sees a mismatch (say, one replica lags after a crash), it marks that replica as stale and triggers a replacement copy from a valid peer.</p>
</li>
<li><p><strong>Chunkserver Death:</strong> The master continuously pings chunkservers. If one doesn’t respond within a timeout window, it assumes failure and schedules replication of its chunks elsewhere.</p>
</li>
<li><p><strong>Data Integrity:</strong> Chunkservers store checksums for fixed-size blocks inside each chunk. During reads, they verify the checksums before serving data; on a mismatch they return an error and report it to the master, which re-replicates the chunk from a healthy replica. <strong>Appends keep the cost of integrity low: the checksum is only computed for the new data, not verified against what is already there, and a later read catches any corruption. Only overwrites need to verify the first and last blocks of the overwritten range before the write happens and the checksums are recomputed.</strong></p>
</li>
</ul>
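<p>The block-level checksum idea is easy to try out. The paper uses 64 KB blocks with a 32-bit checksum each; the sketch below uses <code>zlib.crc32</code> as an assumed stand-in, since the paper does not say which checksum function GFS used.</p>
<pre><code class="lang-python">import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, as described in the paper

def block_checksums(chunk_data):
    """One CRC32 per 64 KB block (CRC32 is an assumed stand-in for the paper's 32-bit checksum)."""
    return [zlib.crc32(chunk_data[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk_data), BLOCK_SIZE)]

def corrupt_blocks(chunk_data, stored):
    """Indexes of blocks whose current checksum no longer matches the stored one."""
    current = block_checksums(chunk_data)
    return [i for i, (now, then) in enumerate(zip(current, stored)) if now != then]

data = bytearray(b"x" * (3 * BLOCK_SIZE))
stored = block_checksums(bytes(data))
data[BLOCK_SIZE + 5] ^= 0xFF  # flip bits inside the second block
bad = corrupt_blocks(bytes(data), stored)
print(bad)  # [1]
</code></pre>
<p>Only the damaged block is flagged, so a read that does not touch it can still be served from this replica.</p>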
<p>What stood out to me was how <strong>failure recovery wasn’t bolted on later — it was baked into the design philosophy.</strong> Instead of chasing absolute reliability of machines (expensive and unrealistic at Google’s scale), GFS treated machines as disposable and focused on <strong>fast detection + automatic recovery.</strong></p>
<p>This mindset — “failures will happen, let’s plan for them” — feels like one of the most transferable lessons from GFS to modern distributed systems.</p>
<h2 id="heading-benchmarks-amp-performance-insights">Benchmarks &amp; Performance Insights</h2>
<p>The GFS team benchmarked their system under production-like workloads, and the results reinforced their design choices:</p>
<ul>
<li><p><strong>Reads scaled well.</strong> With large sequential reads being dominant, spreading them across chunkservers worked effectively.</p>
</li>
<li><p><strong>Writes scaled less so.</strong> Random overwrites weren’t optimized and remained slow — but this was acceptable since overwrites were rare.</p>
</li>
<li><p><strong>Appends dominated.</strong> This justified the record append API and relaxed consistency model.</p>
</li>
<li><p><strong>Hotspots appeared.</strong> Some chunkservers serving popular chunks became overloaded. This revealed a weakness of the design, though load-balancing strategies helped mitigate it.</p>
</li>
<li><p><strong>Reads &gt; Writes.</strong> Read-heavy workloads aligned perfectly with GFS’s optimizations.</p>
</li>
<li><p><strong>Master bottlenecks fixed.</strong> Metadata handling was initially a bottleneck, but redesigned data structures improved throughput.</p>
</li>
<li><p><strong>Recovery speed tuned.</strong> Too-fast recovery would overwhelm the network; too-slow would risk availability. The team tuned the replication speed for balance.</p>
</li>
<li><p><strong>Bimodal request sizes.</strong> Reads and writes were either tiny (control-like) or large batched operations. GFS was tuned to handle this distribution.</p>
</li>
</ul>
<p>Overall, the benchmarks showed that GFS aligned tightly with Google’s workloads: heavy reads, frequent appends, rare overwrites, and resilience under failure.</p>
<h2 id="heading-key-lessons-amp-takeaways">Key Lessons &amp; Takeaways</h2>
<p>The Google File System was never meant to be a general-purpose filesystem. It was purpose-built for Google’s reality: massive data, commodity hardware, frequent failures, and workloads dominated by large sequential reads and record appends. From the paper, a few timeless lessons stand out:</p>
<ul>
<li><p><strong>Design for the workload, not the textbook.</strong> GFS broke conventions (e.g., append-optimized writes, single master) because they fit Google’s reality.</p>
</li>
<li><p><strong>Failures are normal.</strong> Instead of preventing them, GFS embraced replication, checksums, and fast recovery.</p>
</li>
<li><p><strong>Simplicity scales.</strong> Big chunks, in-memory metadata, append semantics — crude on paper, brilliant in practice.</p>
</li>
<li><p><strong>Balance matters.</strong> Tuning replication speed, recovery policies, and replica placement kept the system both reliable and practical.</p>
</li>
<li><p><strong>System design evolves with use.</strong> Benchmarks revealed bottlenecks, which guided iterative improvements.</p>
</li>
</ul>
<h4 id="heading-real-world-impact">Real-World Impact</h4>
<ul>
<li><p><strong>HDFS (Hadoop Distributed File System):</strong> Directly inspired by GFS, enabling the big data revolution.</p>
</li>
<li><p><strong>Log-based systems today:</strong> Kafka, Pulsar, and even modern databases rely heavily on append-only logs — an idea normalized by GFS.</p>
</li>
<li><p><strong>Cloud storage systems:</strong> GFS’s philosophy (cheap hardware + replication + recovery) still underpins services like S3, GCP Storage, and Azure Blob.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Reading the GFS paper was a reminder that large-scale system design isn’t about perfection; it’s about tradeoffs. It’s about correctly judging, assessing, validating, and then making the right design choices for the system you need. Google didn’t try to build a “general-purpose, flawless” file system — they built one that worked for their workload, even if it looked odd from a traditional CS lens.</p>
<hr />
<p>If you still have any queries, you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Why I Finally Stopped Treating My Database Like a Black Box]]></title><description><![CDATA[Introduction
When I started working with Oracle SQL, my relationship with databases was simple: they stored my data, and I fetched it when I needed it. SELECT, INSERT, UPDATE, DELETE—that was my comfort zone. Anything beyond that felt like DBA territ...]]></description><link>https://highonbugs.sbk2k1.in/understanding-dbs</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/understanding-dbs</guid><category><![CDATA[DBMS]]></category><category><![CDATA[Oracle]]></category><category><![CDATA[SQL]]></category><category><![CDATA[orm]]></category><category><![CDATA[PL/SQL]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 24 Aug 2025 15:25:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756049003185/d944b237-6ee4-416c-80be-4b72bc59db05.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>When I started working with Oracle SQL, my relationship with databases was simple: they stored my data, and I fetched it when I needed it. <strong>SELECT, INSERT, UPDATE, DELETE</strong>—that was my comfort zone. Anything beyond that felt like DBA territory, not something I had to worry about. After all, I was writing application code, not tuning Oracle.</p>
<p>But reality caught up quickly. I noticed that most production issues weren’t because our Java or Python code was “bad.” They came from <strong>slow queries, unexpected locks, and mysterious database errors</strong>. The more I built, the clearer it became that treating the database as a “black box” was costing serious time. Debugging sessions stretched longer than necessary, partly due to both me and ChatGPT hallucinating.</p>
<p>Working with <strong>Oracle SQL and PL/SQL</strong> further solidified this realization. Stored procedures, triggers, functions—I used to think they were overkill. But I started to see how much performance I could gain (or lose) based on a single design decision. A well-placed index could save me hours of waiting, while one poorly written query could bring an entire system to its knees. (This did happen to me BTW!)</p>
<p>That’s when I realized: <strong>knowing databases only at the CRUD level is like driving a Ferrari in first gear</strong>. (I copied this line from ChatGPT…but it’s true) You’ll move, but you’ll never hit the performance or control the machine is capable of.</p>
<p>In this post, I’ll share how I broke out of the CRUD trap, what Oracle and my team taught me (both the good and the ugly), and why every developer—especially juniors like me—needs to go deeper into databases. Not to become a DBA, but to become a developer who actually understands the backbone of their application.</p>
<h2 id="heading-the-crud-trap-why-its-holding-you-back">The CRUD Trap: Why It’s Holding You Back</h2>
<p>When I first learnt to code with a database (yep, it was MongoDB), CRUD felt like enough. Create a row, read it, update it, delete it—that covers most app features, right? For a long time, I thought of the database as nothing more than a <strong>fancy storage box</strong>. As long as the app worked and the tests passed, I didn’t care how the database handled things internally.</p>
<p>That mindset led me straight into what I now call the <strong>CRUD trap</strong>.</p>
<h3 id="heading-the-database-as-dumb-storage-mentality">The “Database as Dumb Storage” Mentality</h3>
<p>MongoDB crippled my SQL abilities, so I leaned on ORMs (Object-Relational Mappers) like they were magic. They saved me from writing SQL, but they also blinded me to what was actually happening underneath. When a query was slow, I assumed the problem was my code—or worse, I thought “we’ll just scale with more servers later.” <strong>Spoiler: throwing hardware at bad queries doesn’t fix them.</strong></p>
<p>I also fell for the idea that <strong>ORMs eliminate the need to know SQL</strong>. They don’t. They just <em>hide</em> SQL from you until something breaks, and then you’re stuck staring at logs with queries you don’t understand.</p>
<h3 id="heading-the-real-costs-of-crud-only-knowledge">The Real Costs of CRUD-Only Knowledge</h3>
<p>Here’s what you can run into:</p>
<ul>
<li><p><strong>Performance disasters</strong>: An N+1 query generated by the ORM slips into production. Everything looks fine locally, but under load it means disaster.</p>
</li>
<li><p><strong>Scaling nightmares</strong>: An app that works fine with a few users falls apart when traffic grows. Suddenly, 1,000 concurrent users mean blocked queries and timeouts everywhere.</p>
</li>
<li><p><strong>Data corruption surprises</strong>: Without really understanding transactions or isolation levels, you can end up with race conditions that silently corrupt data. (These may be a bit easier to spot if you have your CS fundamentals on point)</p>
</li>
</ul>
<p>The <strong>CRUD trap</strong> isn’t just about writing basic queries—it’s about staying blind to how your database behaves under real-world conditions. Once I realized that, I knew I had to step up and learn the stuff I had been avoiding.</p>
<h2 id="heading-my-oracle-journey-lessons-from-the-trenches">My Oracle Journey: Lessons from the Trenches</h2>
<p>Working with <strong>Oracle SQL and PL/SQL</strong> was my first real push beyond the CRUD/ORM bubble. At first, it felt intimidating—suddenly I was writing stored procedures, handling triggers, and looking at execution plans that looked more like hieroglyphs than code. But over time, I began to see why people say <strong>“process data where it lives.”</strong></p>
<h3 id="heading-what-plsql-taught-me">What PL/SQL Taught Me</h3>
<ul>
<li><p><strong>Data-centric thinking</strong>: Instead of pulling thousands of rows into my application just to loop through them, I learned to let the database handle it in one go. A single well-written PL/SQL block could replace pages of application code. (I remember when I wrote a PL/SQL procedure that took 7 hours to complete. Cursors FTW!)</p>
</li>
<li><p><strong>Performance gains</strong>: By reducing round-trips between the app and database, I saw query times drop drastically. I didn’t fully appreciate network latency until I watched a job go from minutes to seconds.</p>
</li>
<li><p><strong>Complex business logic in one place</strong>: Multi-step operations (validating input, updating multiple tables, and logging results) could all live inside a single transaction. That consistency was powerful, especially in a complex system with logically interdependent parts.</p>
</li>
</ul>
<h3 id="heading-the-dark-side-i-discovered">The Dark Side I Discovered</h3>
<p>Of course, it wasn’t all smooth sailing. For every win, there was a tradeoff.</p>
<ul>
<li><p><strong>Maintenance hell</strong>: Debugging stored procedures at 2 AM is no fun. Error messages weren’t always clear, and tracking down the cause of a failure deep inside a PL/SQL block was painful. (Not all PL/SQL procedures/packages written were very debuggable either)</p>
</li>
<li><p><strong>Vendor lock-in</strong>: The more business logic we pushed into Oracle-specific PL/SQL, the harder it became to even <em>think</em> about migrating to another database. I realized that too much reliance on proprietary features can paint you into a corner.</p>
</li>
<li><p><strong>Blurred responsibility</strong>: Business logic lived partly in the application and partly in the database. This made it harder for new team members to figure out where a certain rule was enforced.</p>
</li>
<li><p><strong>Synchronous bottleneck</strong>: When you make a call to a PL/SQL procedure, it is typically a <strong>synchronous</strong> operation. Your application thread makes the call and then <em>blocks</em>, waiting for the entire operation to complete before it can do anything else. The entire time the database is doing this complex work (which could be CPU-intensive on the DB server), your application server is sitting idle, holding open a connection and waiting for a response. This increases <strong>response time</strong> for the end-user and <strong>ties up application resources</strong> (threads/connections) that could be serving other requests.</p>
</li>
</ul>
<p>Oracle forced me to see that <strong>databases aren’t just storage—they’re engines</strong>. But it also showed me the danger of leaning too heavily on them for everything. That balance—between application logic, ORM convenience, and database power—is something I’m still learning.</p>
<h2 id="heading-the-essential-database-skills-every-developer-needs">The Essential Database Skills Every Developer Needs</h2>
<p>Once I accepted that databases weren’t just “dumb storage,” I had to figure out <em>what to actually learn</em> (I still am!). The problem is that “<strong>database knowledge</strong>” is a bottomless pit: you could study internals for years. As a junior dev, I focus on the skills that have the biggest impact on my day-to-day work and on fixing real production issues.</p>
<p>Here’s what I found mattered most for me:</p>
<h3 id="heading-41-understanding-database-behavior-amp-performance">4.1 Understanding Database Behavior &amp; Performance</h3>
<p>One of the first “aha” moments I had was reading <strong>execution plans</strong>. Oracle showed me that what I thought was “just a simple query” could be doing a full table scan of millions of rows.</p>
<ul>
<li><p>Learn how your database <strong>fetches pages from disk, caches them, and reuses them</strong>. Suddenly, you’ll see why some queries are blazing fast and others crawl.</p>
</li>
<li><p>Execution plans are like <strong>X-rays for your queries</strong>—without them, you’re guessing.</p>
</li>
<li><p>Outdated statistics once made one of my queries 100x slower until I learned why the optimizer was making “bad” choices.</p>
</li>
</ul>
<h3 id="heading-42-transactions-amp-data-consistency">4.2 Transactions &amp; Data Consistency</h3>
<p>This part was fairly clear because of my college stuff—I knew the ACID acronym and what it meant.</p>
<ul>
<li><p>Understanding <strong>ACID</strong> beyond the acronym saved me from silent data corruption.</p>
</li>
<li><p>I learned that <strong>isolation levels</strong> are tradeoffs: “read committed” prevents dirty reads but still allows non-repeatable and phantom reads, and “serializable” can hurt throughput if misused.</p>
</li>
<li><p>One of the concepts that finally clicked was <strong>deadlock</strong>: two transactions each waiting on a lock the other holds, so neither can proceed on its own. Learning why that happens gave me confidence I never had with databases before.</p>
</li>
</ul>
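<p>To make the “all or nothing” part of ACID concrete, here is a minimal sketch using Python’s built-in <code>sqlite3</code> module rather than Oracle. The <code>accounts</code> table and the <code>transfer</code> and <code>balances</code> helpers are made up for illustration; the only point is that when a later step fails, the earlier step is rolled back with it.</p>
<pre><code class="lang-python">import sqlite3

# In-memory database; the accounts table is a hypothetical example schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        if cur.rowcount != 1:
            raise ValueError("unknown destination account: " + dst)


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


transfer(conn, "alice", "bob", 30)
print(balances(conn))  # {'alice': 70, 'bob': 80}

try:
    transfer(conn, "alice", "carol", 30)  # carol does not exist
except ValueError as exc:
    print("rolled back:", exc)

print(balances(conn))  # the debit was undone: {'alice': 70, 'bob': 80}
</code></pre>
<p>Without the surrounding transaction, the failed transfer would have left alice 30 short with nobody credited, which is exactly the kind of silent corruption described above.</p>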
<h3 id="heading-43-indexes-your-performance-best-friend">4.3 Indexes: Your Performance Best Friend</h3>
<p>Indexes were another turning point. I thought they were just “something DBAs handled.” Then I saw how the right index dropped a query from <strong>minutes to milliseconds</strong>.</p>
<ul>
<li><p>Beyond the basics, <strong>composite indexes</strong> and Oracle’s <strong>bitmap indexes</strong> opened my eyes to the different tradeoffs.</p>
</li>
<li><p>But I also learned that <strong>indexes aren’t free</strong>—they can slow down <strong>inserts</strong> and <strong>updates</strong>. Balance matters.</p>
</li>
</ul>
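<p>The effect of an index on the execution plan is easy to see in miniature. The sketch below uses Python’s built-in <code>sqlite3</code> module and SQLite’s <code>EXPLAIN QUERY PLAN</code>, not Oracle’s tooling; the <code>orders</code> table, the index name, and the <code>plan</code> helper are assumptions made up for the example, and the exact plan wording varies between SQLite versions.</p>
<pre><code class="lang-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


before = plan(conn, QUERY, (42,))
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, QUERY, (42,))

print("before:", before)  # a full scan of the orders table
print("after: ", after)   # a search that uses idx_orders_customer
</code></pre>
<p>The query text never changes; only the plan does. That is the habit worth building: check the plan first, then decide whether an index is worth its write cost.</p>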
<h3 id="heading-44-advanced-query-techniques">4.4 Advanced Query Techniques</h3>
<p>PL/SQL forced me to learn things I would have happily ignored:</p>
<ul>
<li><p><strong>Joins</strong> aren’t all the same—hash joins, nested loops, and merge joins behave differently at scale.</p>
</li>
<li><p><strong>Window functions</strong> felt like magic once I learned them—suddenly I didn’t need ugly cursor loops. (still a bit shaky with those. AI helps!)</p>
</li>
<li><p><strong>CTEs (WITH clauses)</strong> made my complex queries readable and maintainable.</p>
</li>
</ul>
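<p>Here is a small sketch of a CTE feeding a window function, again using Python’s built-in <code>sqlite3</code> module instead of Oracle. The <code>sales</code> table and its data are invented for the example, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.</p>
<pre><code class="lang-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 20), ("east", 3, 5),
     ("west", 1, 7), ("west", 2, 3)],
)

# The CTE names an intermediate result; the window function then computes a
# per-region running total without any explicit cursor loop.
SQL = """
WITH daily AS (
    SELECT region, day, SUM(amount) AS amount
    FROM sales
    GROUP BY region, day
)
SELECT region, day,
       SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total
FROM daily
ORDER BY region, day
"""

rows = conn.execute(SQL).fetchall()
for row in rows:
    print(row)
# ('east', 1, 10)
# ('east', 2, 30)
# ('east', 3, 35)
# ('west', 1, 7)
# ('west', 2, 10)
</code></pre>
<p>The cursor-loop version of this would fetch every row, keep a running sum per region in application code, and reset it whenever the region changes. Here the database does all of that in one statement.</p>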
<h2 id="heading-orms-the-double-edged-sword">ORMs: The Double-Edged Sword</h2>
<p>For a while, I thought learning SQL and PL/SQL meant I could throw away ORMs. I was wrong. In most modern stacks, ORMs are <strong>unavoidable</strong>. They’re the glue between application code and the database, and honestly, they’re a huge productivity boost (especially for people like me who come from NoSQL backgrounds). But I learnt they can also be dangerous if you treat them as magic.</p>
<h4 id="heading-why-orms-are-essential">Why ORMs Are Essential</h4>
<ul>
<li><p><strong>Developer productivity</strong>: As a junior, ORMs let me build features quickly without writing hundreds of lines of boilerplate SQL.</p>
</li>
<li><p><strong>Type safety and abstraction</strong>: I could write code in my primary language (JavaScript, Python, etc.) and let the ORM handle mapping objects to tables.</p>
</li>
<li><p><strong>Security benefits</strong>: ORMs parameterize queries by default, which protects against most SQL injection out of the box, something beginners (like me at first) can easily get wrong.</p>
</li>
<li><p><strong>Rapid prototyping</strong>: For CRUD-heavy apps, nothing beats scaffolding models and having queries “just work.”</p>
</li>
</ul>
<h4 id="heading-the-problems-i-can-run-into">The Problems I Can Run Into</h4>
<p>From looking around on the internet, I could see that ORMs were not bulletproof.</p>
<ul>
<li><p>You can get hit with the <strong>N+1 query problem</strong>: fetching a list of users triggers one extra query per user to load the related rows. In production, this can tank performance.</p>
</li>
<li><p>ORMs can sometimes generate <strong>inefficient queries</strong> that no human would ever write.</p>
</li>
<li><p><strong>Change tracking overhead</strong> can bloat memory usage, especially when an app loads many entities at once.</p>
</li>
</ul>
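<p>The N+1 pattern is easier to recognize once you have counted the queries yourself. The sketch below does not use a real ORM; it imitates what a lazy-loading ORM does, using Python’s built-in <code>sqlite3</code> module and its <code>set_trace_callback</code> hook to record each statement. The <code>users</code> and <code>posts</code> tables and both helper functions are hypothetical.</p>
<pre><code class="lang-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users (id, name) VALUES (1, 'ann'), (2, 'ben'), (3, 'cho');
    INSERT INTO posts (user_id, title) VALUES (1, 'a1'), (1, 'a2'), (2, 'b1');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed


def titles_n_plus_one(conn):
    """One query for the users, then one more query per user."""
    result = {}
    for user_id, name in conn.execute("SELECT id, name FROM users").fetchall():
        posts = conn.execute(
            "SELECT title FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()
        result[name] = [title for (title,) in posts]
    return result


def titles_joined(conn):
    """The same data fetched with a single LEFT JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT u.name, p.title FROM users u"
        " LEFT JOIN posts p ON p.user_id = u.id ORDER BY u.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result


slow = titles_n_plus_one(conn)
slow_count = len(statements)
statements.clear()
fast = titles_joined(conn)
fast_count = len(statements)

print("same data:", slow == fast)
print("queries:", slow_count, "vs", fast_count)
</code></pre>
<p>With three users the difference is four statements against one. With ten thousand users it is ten thousand and one round-trips against one, which is why this only hurts in production. Most ORMs offer some form of eager loading to get the joined behavior; the exact option name depends on the ORM.</p>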
<h4 id="heading-the-right-way-to-use-orms-what-i-am-going-to-do">The Right Way to Use ORMs (What I am going to do)</h4>
<ul>
<li><p><strong>Always monitor query generation</strong>: Log queries in dev mode so I can see what’s happening under the hood.</p>
</li>
<li><p><strong>Use raw SQL when needed</strong>: For complex reporting or batch updates, I’ll bypass the ORM. It’s not betrayal—it’s being practical.</p>
</li>
<li><p><strong>Batch operations</strong>: ORMs aren’t great at bulk inserts/updates—sometimes raw SQL or stored procedures are the better option.</p>
</li>
</ul>
<p>In short: <strong>ORMs aren’t the enemy, but they’re not a free pass either.</strong> The sweet spot is knowing enough SQL to understand and override your ORM when it misbehaves.</p>
<h2 id="heading-stored-procedures-vs-application-logic-vs-orms">Stored Procedures vs Application Logic vs ORMs</h2>
<p>One of the toughest lessons I’ve learned as a junior dev is that <strong>there’s no single “right” place for business logic</strong>. Sometimes it belongs in the application, sometimes in the database, and sometimes the ORM handles it just fine. The challenge is knowing <em>which tool to use and when</em>.</p>
<h4 id="heading-the-three-way-decision-matrix">The Three-Way Decision Matrix</h4>
<ul>
<li><p><strong>Simple CRUD operations</strong> → ORMs shine here. They save time, reduce boilerplate, and make your code easier to maintain. I don’t hand-write SQL just to fetch a user profile or do basic auth stuff anymore.</p>
</li>
<li><p><strong>Complex business logic</strong> → This usually belongs in the <strong>application layer</strong>. It’s easier to test, version control, and debug in code than in a giant PL/SQL procedure. I learned the hard way that debugging a 500-line stored procedure at 2 AM is a nightmare.</p>
</li>
<li><p><strong>Data-intensive operations</strong> → If you’re doing heavy aggregations or transformations, sometimes the <strong>database is the best place</strong>. Letting Oracle process millions of rows in one optimized query is far better than dragging that data into app code.</p>
</li>
<li><p><strong>Reporting &amp; analytics queries</strong> → Often best handled with <strong>raw SQL</strong> or database views. ORMs can’t always express these queries efficiently, and PL/SQL can make them too rigid.</p>
</li>
</ul>
<h4 id="heading-making-the-right-choice">Making the Right Choice</h4>
<p>When I was starting, I often defaulted to whatever was easiest: procedures for everything, or raw SQL in Java code when the team said so. Now I ask myself a few questions first:</p>
<ul>
<li><p><strong>Can my team maintain this?</strong> If the answer is no, then putting it in PL/SQL just because it’s “faster” is a bad idea.</p>
</li>
<li><p><strong>What are the performance requirements?</strong> For mission-critical paths (like payment processing), I consider stored procs or carefully optimized SQL.</p>
</li>
<li><p><strong>How complex is deployment?</strong> Changing app code is usually easier than deploying updated stored procedures in production.</p>
</li>
<li><p><strong>How will I test this?</strong> Application logic is easier to unit test. Database logic often requires integration testing.</p>
</li>
</ul>
<h4 id="heading-the-balance-im-learning">The Balance I’m Learning</h4>
<p>The truth is, it’s not about choosing one approach forever—it’s about <strong>balancing tradeoffs</strong>.</p>
<ul>
<li><p>ORMs are great for speed and safety, but can betray you if you don’t understand the SQL behind them.</p>
</li>
<li><p>Application logic is maintainable, but can be slower if you’re moving tons of data out of the database.</p>
</li>
<li><p>Stored procedures are powerful but can lock you into a vendor and create long-term maintenance pain.</p>
</li>
</ul>
<p>As a junior dev, I used to think the answer was to pick one and stick with it. Now I see that the real skill is <strong>knowing when to switch gears</strong>. There are countless approaches, and none are inherently wrong. There is just usually one that fits the situation better.</p>
<h2 id="heading-scaling-and-performance">Scaling and Performance</h2>
<p>At some point, everyone hits the wall where a query that felt “instant” during development now crawls when there are too many records. I learned the hard way that scaling a database isn’t just about throwing more hardware at it.</p>
<ul>
<li><p><strong>Query Design Matters</strong>: A poorly written query can break your system faster than a lack of RAM. Simple changes—like selecting only the fields you need instead of <code>SELECT *</code>—can massively cut down response times.</p>
</li>
<li><p><strong>Partitioning and Sharding</strong>: Once datasets get too large, you can’t keep everything in one neat table. Horizontal partitioning (splitting a table by rows, which becomes sharding once the pieces live on different servers) and vertical partitioning (splitting tables by columns) are real tools, not just academic jargon.</p>
</li>
<li><p><strong>Connection Pooling</strong>: Early on, I let every request open a new connection. Unsurprisingly, the DB server keeled over under load. Pooling transformed my app’s stability.</p>
</li>
</ul>
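<p>To show what pooling actually changes, here is a deliberately tiny pool built on Python’s standard library. It is a sketch of the idea, not a replacement for the pool that ships with your driver or framework: the <code>ConnectionPool</code> class and its <code>opened</code> counter are invented for this example, and it uses in-memory SQLite connections (each one is its own separate database) purely so the code runs anywhere.</p>
<pre><code class="lang-python">import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A tiny fixed-size pool: connections are opened once and then reused."""

    def __init__(self, database, size):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # blocks when the pool is exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


pool = ConnectionPool(":memory:", size=2)

results = []
for request_id in range(10):  # ten "requests" share two connections
    with pool.connection() as conn:
        results.append(conn.execute("SELECT ? * 2", (request_id,)).fetchone()[0])

print(results, "connections opened:", pool.opened)
pool.close()
</code></pre>
<p>Ten requests are served by two connections instead of ten. The pool size also acts as a cap: when every connection is busy, new requests wait in the queue instead of piling more connections onto the database server.</p>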
<p>Performance tuning isn’t a one-time activity—it’s a continuous loop of observing, testing, and adjusting.</p>
<h2 id="heading-things-im-looking-forward-to-learning">Things I’m looking forward to learning</h2>
<p>I’ve mostly approached databases the traditional way: writing SQL queries and procedures, and making schema changes directly. But I know that’s not sustainable when projects scale. What I want to explore next is the idea of <strong>treating the database like code</strong>.</p>
<p>Some areas I’ve only scratched the surface of (or haven’t explored yet) include:</p>
<ul>
<li><p><strong>Migration Frameworks</strong> – Tools like <strong>Liquibase</strong>, <strong>Flyway</strong>, and <strong>Alembic</strong> are on my radar, but I haven’t used them. I want to understand how they handle upgrades, rollbacks, and CI/CD integration.</p>
</li>
<li><p><strong>Automated Database Builds</strong> – The idea of being able to spin up a database from scratch, seeded with data, using a single command, sounds powerful—and I want to get there.</p>
</li>
<li><p><strong>Cross-Team Collaboration</strong> – I’d like to learn how developers, DBAs, and DevOps folks actually coordinate database changes in practice when following this model.</p>
</li>
</ul>
<p>I suspect that once I start adopting these practices, my database skills will evolve from reactive fixes to deliberate design.</p>
<h2 id="heading-the-mindset-shift-from-crud-to-craft">The Mindset Shift: From CRUD to Craft</h2>
<p>So far, most of my work has been around <strong>CRUD operations</strong> and <strong>PL/SQL</strong>. But I realize databases can be so much more, and this is where I want to level up.</p>
<p>Things I haven’t explored yet, but want to:</p>
<ul>
<li><p><strong>Data Modeling for Scale</strong> – Normalization I know, but I haven’t dived into when to denormalize, how to structure schemas for analytics, or how modern systems like Postgres + JSONB balance relational and flexible storage.</p>
</li>
<li><p><strong>Security &amp; Access Control</strong> – I’ve relied on defaults so far, but I want to understand fine-grained roles, row-level security, and modern practices for multi-tenant applications.</p>
</li>
<li><p><strong>Event-Driven Databases</strong> – Triggers I know about, but I haven’t explored how databases can participate in event-driven architectures (e.g., Postgres + Kafka, CDC pipelines).</p>
</li>
<li><p><strong>Postgres &amp; Beyond</strong> – My experience is heavily Oracle-centric. I want to broaden into Postgres and explore what features I’ve been missing out on.</p>
</li>
</ul>
<p>For me, this mindset shift is about moving from just “getting the data out” to <strong>designing data systems intentionally</strong>—something I know I haven’t mastered yet, but want to.</p>
<hr />
<p>If you still have any queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[NewsCom: Unleashing Community Voices]]></title><description><![CDATA[What started as a simple project to learn GitHub Actions became a potentially powerful conduit for collaboration. NewsCom envisages diverse voices to converge to share insights, experiences, and stories. In this blog, we embark on a journey through a...]]></description><link>https://highonbugs.sbk2k1.in/newscom</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/newscom</guid><category><![CDATA[GitHub]]></category><category><![CDATA[community]]></category><category><![CDATA[React]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 21 Jan 2024 18:36:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705862116141/faf9afa2-435a-4c8b-b10a-de9f09730fe5.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>What started as a simple project to learn GitHub Actions became a potentially powerful conduit for collaboration. NewsCom envisages diverse voices to converge to share insights, experiences, and stories. In this blog, we embark on a journey through a unique project's inception, development, and intricacies. This community-driven newsletter thrives on the collective wisdom of its contributors.</p>
<h2 id="heading-the-purpose-and-goals"><strong>The Purpose and Goals</strong></h2>
<p>At the heart of this initiative lies a simple yet profound purpose: to cultivate a space where tech gets written by and for techies with minimal setup. This community newsletter is not just a platform for disseminating information; it is a testament to the power of collaborative storytelling, where the richness of shared experiences transcends boundaries. It aims to keep content contribution decoupled from the platform while still letting contributors retain a degree of ownership.</p>
<h2 id="heading-github-authentication-a-pillar-of-security-and-engagement"><strong>GitHub Authentication: A Pillar of Security and Engagement</strong></h2>
<p>To preserve content ownership and keep the environment secure, we've implemented user authentication through GitHub accounts. This streamlines the contribution process and adds an extra layer of trust and accountability to the community.</p>
<p>By leveraging GitHub's robust authentication system, we enhance the security of our platform and integrate with a vast network of developers and enthusiasts. This approach simplifies the registration process (more like signing in, in our case) and taps into the existing GitHub community, creating a familiar and welcoming environment for users.</p>
<h2 id="heading-creating-and-submitting-articles-on-the-webpage"><strong>Creating and Submitting Articles on the Webpage</strong></h2>
<p>With the foundation of GitHub authentication laid out in the previous chapter, we now turn our attention to the heart of our community-driven newsletter project – the process of creating and submitting articles. Our dedicated webpage serves as a simple Markdown editor where contributors, armed with their GitHub identities, can weave their narratives and share their insights. Any GitHub User needs 5 steps to potentially contribute their first article:</p>
<ul>
<li><p>Go to <a target="_blank" href="https://newscom.sbk2k1.tech/">https://newscom.sbk2k1.tech/</a></p>
</li>
<li><p>Sign in with GitHub by clicking on "Write a Blog Yourself".</p>
</li>
<li><p>Authenticate GitHub App.</p>
</li>
<li><p>Click on "Write a Blog Yourself".</p>
</li>
<li><p>Write and Submit!</p>
</li>
</ul>
<p>As contributors craft their articles, the integration with GitHub ensures that each iteration is tracked and worked with seamlessly. Once an article takes shape and the contributor is satisfied, our platform facilitates the submission process through a straightforward interface. This can also foster a better understanding of Git and GitHub amongst developers.</p>
<h2 id="heading-pull-requests-and-collaborative-editing"><strong>Pull Requests and Collaborative Editing</strong></h2>
<p>In the collaborative ecosystem of our community newsletter, the editorial process takes center stage as repository collaborators assume the role of editors. GitHub pull requests (PRs) become the conduit through which individual contributions undergo scrutiny, refinement, and ultimately, integration into the collective narrative.</p>
<p><strong>Collaborators as Editors</strong></p>
<p>Within our GitHub repository, a select group of individuals – our repository collaborators – take on the pivotal role of editors. Endowed with the responsibility of reviewing and curating content, these collaborators evaluate each pull request based on the project's editorial guidelines. Their expertise ensures that the newsletter maintains a high standard of quality, coherence, and relevance.</p>
<h2 id="heading-automation-with-github-actions">Automation with GitHub Actions</h2>
<p>GitHub Actions, a powerful workflow automation tool, becomes the silent orchestrator behind the scenes, ensuring that our community newsletter project runs seamlessly. This chapter explores the role of GitHub Actions in automating critical processes, from triggering events based on article collection milestones to compiling and distributing the newsletter to our eager subscribers.</p>
<h4 id="heading-triggering-actions-from-collection-to-compilation"><strong>Triggering Actions: From Collection to Compilation</strong></h4>
<p>One of the primary functions of GitHub Actions in our project is the automatic triggering of workflows based on predefined conditions. As articles accumulate in the main branch (collecting), a GitHub Action is set to activate when a certain number is reached. This marks the commencement of the compilation process, ensuring that the newsletter evolves organically.</p>
<h4 id="heading-email-notifications-connecting-with-subscribers"><strong>Email Notifications: Connecting with Subscribers</strong></h4>
<p>With the compiled newsletter in hand, GitHub Actions takes the next step by automating the distribution process. Subscribers, eagerly awaiting the latest edition, receive automated email notifications. This ensures timely and consistent delivery, enhancing the overall user experience and engagement.</p>
<h4 id="heading-cleanup-and-cloudinary-backup"><strong>Cleanup and Cloudinary Backup</strong></h4>
<p>In the spirit of meticulous housekeeping, GitHub Actions goes a step further by cleaning up the main branch post-compilation. This ensures a fresh slate for the next collection cycle. Simultaneously, a backup process sends the compiled files to Cloudinary, providing a secure archive for previous issues. This strategic backup strategy adds an extra layer of protection against data loss.</p>
<p><strong><em>Note: Currently a user can write only 1 article per day.</em></strong></p>
<h2 id="heading-future-decisions">Future Decisions</h2>
<p>Currently, the project has quite a few shortcomings. The front end and UI are <strong>not the best</strong>. The IDE and the overall feel are not very user-friendly. Moreover, there is no subscribe and publish feature, since they will need cloud-hosted services (at least the database). I'll push these features when there are <strong><em>at least 50 article submissions</em></strong>.</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>The project is centered around minimal setup and a simple workflow. I'll try to keep improving on that. Careful curation of articles is something I'll need to focus on as the project scales.</p>
<h2 id="heading-technologies-used">Technologies used</h2>
<ul>
<li><p><strong>GitHub Actions:</strong> Used for automating workflows, such as triggering events based on article collection milestones, compiling Markdown articles, and distributing the newsletter.</p>
</li>
<li><p><strong>Node.js:</strong> Backend proxy that handles publishing and related tasks.</p>
</li>
<li><p><strong>React.js:</strong> The website</p>
</li>
<li><p><strong>Octokit:</strong> A JavaScript toolkit for the GitHub API. It facilitates communication with the GitHub API, allowing seamless integration and interaction with GitHub features.</p>
</li>
<li><p><strong>GitHub API and OAuth:</strong> The GitHub API is used for accessing and manipulating GitHub data, while OAuth is employed for secure and standardized user authentication using GitHub accounts.</p>
</li>
<li><p><strong>GitHub Version Control:</strong> Inherent to the GitHub platform, version control is fundamental for tracking changes, managing branches, and ensuring the integrity of the project's codebase.</p>
</li>
</ul>
<hr />
<ul>
<li><p>Website: <a target="_blank" href="https://newscom.sbk2k1.tech/">https://newscom.sbk2k1.tech/</a></p>
</li>
<li><p>GitHub Repo: <a target="_blank" href="https://github.com/High-on-Bugs/HoB-Community-Newsletter">https://github.com/High-on-Bugs/HoB-Community-Newsletter</a></p>
</li>
</ul>
<p>If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[How to deploy your Website to GitHub Pages using GitHub Actions]]></title><description><![CDATA[Step 1: Organize Files
Place your HTML file, CSS file, and assets/js in a dedicated folder. Let's call it website.
The folder structure should look something like this.
  ├── .github/workflows/deploy.yml     # we'll talk about this later    
  ├── /W...]]></description><link>https://highonbugs.sbk2k1.in/how-to-deploy-your-website-to-github-pages-using-github-actions</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/how-to-deploy-your-website-to-github-pages-using-github-actions</guid><category><![CDATA[github-actions]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[GitHubPages]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Fri, 19 Jan 2024 09:03:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705650059692/0f2465a1-2172-4bcc-a826-e5e2131c046f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-step-1-organize-files"><strong>Step 1: Organize Files</strong></h2>
<p>Place your HTML file, CSS file, and assets/js in a dedicated folder. Let's call it <code>Website</code>. The name is case-sensitive on the Linux runner, so it must match the path used in the workflow.</p>
<p>The folder structure should look something like this.</p>
<pre><code class="lang-bash">  ├── .github/workflows/deploy.yml     <span class="hljs-comment"># we'll talk about this later    </span>
  ├── /Website/                        <span class="hljs-comment"># All your website stuff</span>
        ├── HTML
        ├── CSS
        ├── Js
        ├── /Other Assets/   
  ├── Readme.md                        <span class="hljs-comment"># optional - not necessary</span>
</code></pre>
<h2 id="heading-step-2-understanding-github-actions"><strong>Step 2: Understanding GitHub Actions</strong></h2>
<p>GitHub Actions is a powerful automation and continuous integration (CI) tool seamlessly integrated into the GitHub platform.</p>
<p>Utilizing declarative YAML configurations, GitHub Actions allows developers to define workflows—automated processes triggered by various events such as code pushes or pull requests.</p>
<p>These workflows consist of jobs and steps, where each step represents a task, such as building, testing, or deploying code.</p>
<p>Check out this video to understand it better:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=mFFXuXjVgkU">https://www.youtube.com/watch?v=mFFXuXjVgkU</a></div>
<p> </p>
<h2 id="heading-step-3-create-github-action-workflow"><strong>Step 3: Create GitHub Action Workflow</strong></h2>
<p>Create a new folder named <code>.github/workflows</code> in the root of your repository, and inside it, create a file named <code>deploy.yml</code>. This YAML file will define your GitHub Actions workflow.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">to</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Pages</span>

<span class="hljs-attr">on:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">branches:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">master</span> <span class="hljs-comment"># change to your main branch</span>

<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">deploy:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">permissions:</span>
      <span class="hljs-attr">contents:</span> <span class="hljs-string">write</span>
    <span class="hljs-attr">concurrency:</span>
      <span class="hljs-attr">group:</span> <span class="hljs-string">${{</span> <span class="hljs-string">github.workflow</span> <span class="hljs-string">}}-${{</span> <span class="hljs-string">github.ref</span> <span class="hljs-string">}}</span>

    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkout</span> <span class="hljs-string">Repository</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v2</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">ref:</span> <span class="hljs-string">master</span>   <span class="hljs-comment"># again change with your main branch</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Init</span> <span class="hljs-string">new</span> <span class="hljs-string">repo</span> <span class="hljs-string">in</span> <span class="hljs-string">website</span> <span class="hljs-string">folder</span> <span class="hljs-string">and</span> <span class="hljs-string">commit</span> <span class="hljs-string">generated</span> <span class="hljs-string">files</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">|
          cd ./Website
          git init
          git add .
          git config --local user.email "bhattacharyasaptarshi2001@gmail.com"
          git config --local user.name "Saptarshi"
          git commit -m 'deploy'
</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Add</span> <span class="hljs-string">safe.directory</span> <span class="hljs-string">exception</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">git</span> <span class="hljs-string">config</span> <span class="hljs-string">--global</span> <span class="hljs-string">--add</span> <span class="hljs-string">safe.directory</span> <span class="hljs-string">/github/workspace/Website</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Force</span> <span class="hljs-string">push</span> <span class="hljs-string">to</span> <span class="hljs-string">destination</span> <span class="hljs-string">branch</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">ad-m/github-push-action@master</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">github_token:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.GITHUB_TOKEN</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">branch:</span> <span class="hljs-string">gh-pages</span>
          <span class="hljs-attr">force:</span> <span class="hljs-literal">true</span>
          <span class="hljs-attr">directory:</span> <span class="hljs-string">./Website</span>
</code></pre>
<p><em>Let's understand each part and what it does.</em></p>
<h3 id="heading-workflow-name-and-trigger"><strong>Workflow Name and Trigger:</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">Deploy</span> <span class="hljs-string">to</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Pages</span>

<span class="hljs-attr">on:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">branches:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">master</span> <span class="hljs-comment"># Change to your main branch</span>
</code></pre>
<ul>
<li><p><code>name:</code>: Defines the name of the GitHub Actions workflow. In this case, it's named "Deploy to GitHub Pages."</p>
</li>
<li><p><code>on:</code>: Specifies the events that trigger the workflow. Here, the workflow runs on every push to the specified branches, in this case, the <code>master</code> branch.</p>
</li>
</ul>
<h3 id="heading-job-configuration"><strong>Job Configuration:</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">deploy:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">permissions:</span>
      <span class="hljs-attr">contents:</span> <span class="hljs-string">write</span>
    <span class="hljs-attr">concurrency:</span>
      <span class="hljs-attr">group:</span> <span class="hljs-string">${{</span> <span class="hljs-string">github.workflow</span> <span class="hljs-string">}}-${{</span> <span class="hljs-string">github.ref</span> <span class="hljs-string">}}</span>
</code></pre>
<ul>
<li><p><code>runs-on:</code>: Specifies the GitHub-hosted runner environment for the job. Here, it's set to run on the latest version of Ubuntu.</p>
</li>
<li><p><code>permissions:</code>: Grants the job's <code>GITHUB_TOKEN</code> write access to the contents of the repository, which the final step needs in order to push to <code>gh-pages</code>.</p>
</li>
<li><p><code>concurrency:</code>: Groups workflow runs by workflow name and branch so that only one run in a group executes at a time. A newer run waits instead of deploying in parallel, which prevents race conditions when deploying.</p>
</li>
</ul>
<h3 id="heading-steps"><strong>Steps:</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-attr">steps:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkout</span> <span class="hljs-string">Repository</span>
    <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v2</span>
    <span class="hljs-attr">with:</span>
      <span class="hljs-attr">ref:</span> <span class="hljs-string">master</span>   <span class="hljs-comment"># Change with your main branch</span>
</code></pre>
<ul>
<li><code>actions/checkout@v2:</code>: Action that checks out the repository at the specified <code>ref</code> (branch). In this case, it's checking out the <code>master</code> branch.</li>
</ul>
<pre><code class="lang-yaml">  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Init</span> <span class="hljs-string">new</span> <span class="hljs-string">repo</span> <span class="hljs-string">in</span> <span class="hljs-string">the</span> <span class="hljs-string">website</span> <span class="hljs-string">folder</span> <span class="hljs-string">and</span> <span class="hljs-string">commit</span> <span class="hljs-string">generated</span> <span class="hljs-string">files</span>
    <span class="hljs-attr">run:</span> <span class="hljs-string">|
      cd ./Website
      git init
      git add .
      git config --local user.email "bhattacharyasaptarshi2001@gmail.com"
      git config --local user.name "Saptarshi"
      git commit -m 'deploy'</span>
</code></pre>
<ul>
<li><p><code>run:</code>: Executes a series of shell commands.</p>
</li>
<li><p><code>cd ./Website:</code>: Changes the working directory to the <code>./Website</code> folder.</p>
</li>
<li><p><code>git init:</code>: Initializes a new Git repository.</p>
</li>
<li><p><code>git add .:</code>: Adds all files in the current directory to the staging area.</p>
</li>
<li><p><code>git config:</code>: Sets local Git configurations for the user's email and name.</p>
</li>
<li><p><code>git commit:</code>: Commits the changes with the message 'deploy.'</p>
</li>
</ul>
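<p>One detail worth spelling out: because <code>git init</code> starts a brand-new history on every run, the resulting commit shares no ancestry with whatever is already on <code>gh-pages</code>, which is presumably why the later push has to be forced. Below is a minimal sketch for trying the same sequence locally. The function name <code>init_and_commit</code> and the placeholder identity are my own assumptions, not part of the workflow.</p>
<pre><code class="lang-bash"># Hypothetical helper: turn a folder into a fresh repository with a single
# 'deploy' commit, mirroring what the workflow step does inside ./Website.
init_and_commit() {
  dir="$1"
  git -C "$dir" init --quiet
  # Placeholder identity for illustration; use your own name and email.
  git -C "$dir" config --local user.email "you@example.com"
  git -C "$dir" config --local user.name "Your Name"
  git -C "$dir" add .
  git -C "$dir" commit --quiet -m 'deploy'
}

# Usage (assumes a ./Website folder with at least one file in it):
#   init_and_commit ./Website
#   git -C ./Website rev-list --count HEAD   # number of commits in the new history
</code></pre>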
<pre><code class="lang-yaml">  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Add</span> <span class="hljs-string">safe.directory</span> <span class="hljs-string">exception</span>
    <span class="hljs-attr">run:</span> <span class="hljs-string">git</span> <span class="hljs-string">config</span> <span class="hljs-string">--global</span> <span class="hljs-string">--add</span> <span class="hljs-string">safe.directory</span> <span class="hljs-string">/github/workspace/Website</span>
</code></pre>
<ul>
<li><code>git config --global --add safe.directory /github/workspace/Website</code>: Marks the <code>Website</code> folder as a safe directory in the global Git configuration, so that Git does not refuse to operate on the newly created repository if it is owned by a different user than the one running the push step.</li>
</ul>
<pre><code class="lang-yaml">  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Force</span> <span class="hljs-string">push</span> <span class="hljs-string">to</span> <span class="hljs-string">the</span> <span class="hljs-string">destination</span> <span class="hljs-string">branch</span>
    <span class="hljs-attr">uses:</span> <span class="hljs-string">ad-m/github-push-action@master</span>
    <span class="hljs-attr">with:</span>
      <span class="hljs-attr">github_token:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.GITHUB_TOKEN</span> <span class="hljs-string">}}</span>
      <span class="hljs-attr">branch:</span> <span class="hljs-string">gh-pages</span>
      <span class="hljs-attr">force:</span> <span class="hljs-literal">true</span>
      <span class="hljs-attr">directory:</span> <span class="hljs-string">./Website</span>
</code></pre>
<ul>
<li><p><code>ad-m/github-push-action@master:</code>: Utilizes the GitHub Push Action to force push changes to a specified branch.</p>
</li>
<li><p><code>github_token:</code>: Uses the repository's GitHub token stored in secrets for authentication.</p>
</li>
<li><p><code>branch: gh-pages:</code>: Specifies the branch to which the changes will be forcefully pushed.</p>
</li>
<li><p><code>force: true:</code>: Enables force pushing, which overwrites the existing history on the <code>gh-pages</code> branch.</p>
</li>
<li><p><code>directory: ./Website:</code>: Specifies the directory from which to push the changes.</p>
</li>
</ul>
<h2 id="heading-step-4-create-gh-pages-branch"><strong>Step 4: Create</strong> <code>gh-pages</code> <strong>Branch</strong></h2>
<p>Create a new branch named <code>gh-pages</code>. GitHub automatically detects this branch and deploys its content to GitHub Pages. You can create the branch from the GitHub website or from the command line, and you can also select it manually in the repository settings.</p>
<p>The commands are:</p>
<pre><code class="lang-powershell">git checkout <span class="hljs-literal">-b</span> gh<span class="hljs-literal">-pages</span>
git push origin gh<span class="hljs-literal">-pages</span>
</code></pre>
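<p>To confirm that the branch actually reached GitHub before heading to the settings page, you can ask the remote directly. This is a small sketch, and <code>remote_has_branch</code> is a hypothetical helper name of mine:</p>
<pre><code class="lang-bash"># Hypothetical helper: succeeds (exit status 0) if the given remote has a
# branch with the given name, and fails otherwise.
remote_has_branch() {
  # --exit-code makes git ls-remote return a non-zero status when no ref matches
  git ls-remote --exit-code --heads "$1" "$2"
}

# Usage:
#   remote_has_branch origin gh-pages
</code></pre>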
<p>Go to <code>https://github.com/sbk2k1/&lt;repository_name&gt;/settings/pages</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1705652114622/4346411a-94f6-4a59-87ec-c0cb44dc2a05.png" alt class="image--center mx-auto" /></p>
<p>You can also configure the Pages source branch manually on this page.</p>
<h2 id="heading-step-5-set-github-token"><strong>Step 5: Set GitHub Token</strong></h2>
<p>In your repository, go to <strong><em>"Settings"</em></strong> &gt; <strong><em>"Actions"</em></strong> &gt; <strong><em>"General"</em></strong>. Scroll down to the <strong><em>"Workflow permissions"</em></strong> section, select the <strong><em>"Read and write permissions"</em></strong> option, and save. This is what allows the workflow's <code>GITHUB_TOKEN</code> to push to the repository.</p>
<h2 id="heading-step-6-remove-custom-domain-cname-optional"><strong>Step 6: Remove Custom Domain CNAME (Optional)</strong></h2>
<p>If you want to remove a custom domain from your GitHub Pages site (assuming you previously set it up and don't want it anymore), go to the URL below:</p>
<p><code>https://github.com/USERNAME/USERNAME.github.io/blob/master/CNAME</code></p>
<p>Then remove the CNAME entry inside the file.</p>
<p>Now, commit and push these changes, and your GitHub Actions workflow will automatically deploy your static page to GitHub Pages on the <code>gh-pages</code> branch!</p>
<hr />
<p>Repository with the code and setup: <a target="_blank" href="https://github.com/High-on-Bugs/Deploy-Wesbite-using-Actions-and-Pages">https://github.com/High-on-Bugs/Deploy-Wesbite-using-Actions-and-Pages</a></p>
<p>GitHub Issue discussing expired domain problem: <a target="_blank" href="https://github.com/isaacs/github/issues/1213">https://github.com/isaacs/github/issues/1213</a></p>
<hr />
<p>If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Deploy your TypeScript Express App to Vercel (2024)]]></title><description><![CDATA[Disclaimer: This blog does not discuss express and how to build server logic. This only focuses on deploying the app to Vercel as a Serverless Function.
Step 1: Export app instead of listening on a certain port.
Export the app in ES6 fashion rather t...]]></description><link>https://highonbugs.sbk2k1.in/typescript-express-vercel-deploy</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/typescript-express-vercel-deploy</guid><category><![CDATA[TypeScript]]></category><category><![CDATA[Express]]></category><category><![CDATA[Vercel]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Thu, 18 Jan 2024 15:22:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705589155792/77d6ed7a-36f4-45ee-9b5d-f81a20dd9e46.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong><em>Disclaimer: This blog does not discuss express and how to build server logic. This only focuses on deploying the app to Vercel as a Serverless Function.</em></strong></p>
<h3 id="heading-step-1-export-app-instead-of-listening-on-a-certain-port">Step 1: Export <code>app</code> instead of listening on a certain port.</h3>
<p>Export the <code>app</code> as an ES6 default export instead of calling <code>app.listen()</code>.</p>
<p><strong><em>This</em></strong></p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> app;
</code></pre>
<p><strong><em>Instead of</em></strong></p>
<pre><code class="lang-typescript">app.listen(PORT, <span class="hljs-function">() =&gt;</span> {
    <span class="hljs-built_in">console</span>.log(<span class="hljs-string">"Server listening on port"</span>, PORT);
});
</code></pre>
<h3 id="heading-step-2-create-an-api-folder-for-vercel-and-set-it-up">Step 2: Create an <code>api</code> folder for Vercel and set it up.</h3>
<p>Create an <code>api</code> folder that has an <code>index.ts</code> as follows:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> app <span class="hljs-keyword">from</span> <span class="hljs-string">'../app'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> app;
</code></pre>
<p>This imports the app from your root directory (change the path if you have a different setup) and exports it for Vercel.</p>
<h3 id="heading-step-3-mention-the-api-folder-in-tsconfigjson">Step 3: Mention the API folder in <code>tsconfig.json</code></h3>
<p>Update <code>tsconfig.json</code> as follows so that the compiler includes the <code>api</code> folder:</p>
<pre><code class="lang-typescript">{
  <span class="hljs-string">"compilerOptions"</span>: {
    <span class="hljs-string">"module"</span>: <span class="hljs-string">"commonjs"</span>,
    <span class="hljs-string">"esModuleInterop"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-string">"target"</span>: <span class="hljs-string">"es6"</span>,
    <span class="hljs-string">"rootDir"</span>: <span class="hljs-string">"./"</span>,
    <span class="hljs-string">"outDir"</span>: <span class="hljs-string">"build"</span>,
    <span class="hljs-string">"strict"</span>: <span class="hljs-literal">true</span>
  },
  <span class="hljs-string">"include"</span>: [<span class="hljs-string">"./api/*.ts"</span>] <span class="hljs-comment">// -&gt; this is the line you need to update</span>
}
</code></pre>
<h3 id="heading-step-4-create-the-public-folder">Step 4: Create the Public folder</h3>
<p>Create an empty <code>Public</code> folder because Vercel looks for it during deployment. We need to keep it even though no static files are to be served.</p>
<p>Create a <code>.gitkeep</code> file inside it so that Git tracks the otherwise empty folder.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1705590358609/034a47e9-4255-41b9-956c-f43fadf11a0a.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-step-5-create-verceljson-file">Step 5: Create <code>vercel.json</code> file</h3>
<p>The <code>vercel.json</code> should look like this</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"rewrites"</span>: [
        {
            <span class="hljs-attr">"source"</span>: <span class="hljs-string">"/(.*)"</span>,
            <span class="hljs-attr">"destination"</span>: <span class="hljs-string">"/api"</span>
        }
    ]
}
</code></pre>
<p>This rewrites every incoming request, whatever its path, to <code>/api</code>. Vercel serves <code>/api</code> from <code>/api/index.ts</code>, which in turn imports the Express app from <code>app.ts</code>, so Express takes over the routing from there.</p>
<h3 id="heading-step-6-rewrite-the-build-command-in-packagejson">Step 6: Rewrite the Build Command in <code>package.json</code></h3>
<p>Vercel handles all the transpilation, so we need to override the build step in <code>package.json</code>. Create a new script called <code>vercel-build</code>, which Vercel runs in place of the regular <code>build</code> script. It keeps the TypeScript compiler from being invoked and simply acts as a dummy placeholder. The script should look like this:</p>
<pre><code class="lang-json"><span class="hljs-string">"vercel-build"</span>: <span class="hljs-string">"echo hello"</span>,
</code></pre>
<h3 id="heading-step-7-deploy">Step 7: Deploy</h3>
<p>The app is now deployable and you can do it from the Vercel console by connecting your GitHub account.</p>
<p>If you have Vercel CLI installed you can check it on your local machine using the command</p>
<pre><code class="lang-bash">vercel dev
</code></pre>
<p>and deploy using</p>
<pre><code class="lang-bash">vercel
</code></pre>
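<p>With <code>vercel dev</code> running, a quick way to see the rewrite from Step 5 in action is to request a few different paths and confirm that Express answers all of them. The sketch below assumes the dev server listens on <code>http://localhost:3000</code> (adjust it if yours prints a different URL) and that the example paths are routes your app defines. <code>status_of</code> is a hypothetical helper name.</p>
<pre><code class="lang-bash"># Hypothetical helper: print only the HTTP status code for base URL + path.
status_of() {
  base="$1"
  path="$2"
  # -s silences progress output, -o discards the body, -w prints the status code
  curl -s -o /dev/null -w '%{http_code}\n' "${base}${path}"
}

# Usage (URL and paths are assumptions):
#   status_of http://localhost:3000 /
#   status_of http://localhost:3000 /some/nested/route
</code></pre>
<p>If a path that your Express app defines comes back as a 404 from Vercel rather than from Express, the rewrite in <code>vercel.json</code> is the first thing to check.</p>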
<hr />
<p>My Code: <a target="_blank" href="https://github.com/High-on-Bugs/typescript-express-vercel-tutorial">https://github.com/High-on-Bugs/typescript-express-vercel-tutorial</a></p>
<p>For a video walkthrough, check this out:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=B-T69_VP2Ls">https://www.youtube.com/watch?v=B-T69_VP2Ls</a></div>
<p> </p>
<hr />
<p>If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Setting Up My Simple Home Server: A Practical Guide]]></title><description><![CDATA[Chapter 1: Formatting the Old PC
Introduction
Welcome to the kickoff of my home server project! Imagine this: this forgotten PC, originally my dad's ex-workstation, is sitting idle. Armed with an Intel i3 8th gen processor, no flashy GPU, and a humbl...]]></description><link>https://highonbugs.sbk2k1.in/setting-up-my-home-server</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/setting-up-my-home-server</guid><category><![CDATA[Ubuntu]]></category><category><![CDATA[debian]]></category><category><![CDATA[server]]></category><category><![CDATA[NAS storage solutions]]></category><category><![CDATA[media]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 14 Jan 2024 10:24:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705227618720/4fd18848-3721-46d1-b931-686e85314423.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-chapter-1-formatting-the-old-pc"><strong>Chapter 1: Formatting the Old PC</strong></h2>
<h3 id="heading-introduction"><strong>Introduction</strong></h3>
<p>Welcome to the kickoff of my home server project! Imagine this: this forgotten PC, originally my dad's ex-workstation, is sitting idle. Armed with an Intel i3 8th gen processor, no flashy GPU, and a humble 8 gigs of RAM, I saw potential in turning it into a nifty home server.</p>
<p><strong>The Appeal of Repurposing</strong></p>
<p>Repurposing would let me use the home server to back up files, stream movies and media, deploy my projects, and learn some other cool stuff.</p>
<h3 id="heading-backing-up-data"><strong>Backing Up Data</strong></h3>
<p><strong>Importance of Data Backup</strong></p>
<p>There were not a lot of files to be backed up. So I used Google Drive to back a few of them up. The rest would get purged.</p>
<h2 id="heading-chapter-2-clean-slate"><strong>Chapter 2: Clean Slate</strong></h2>
<p><strong>The Decision to Start Fresh</strong></p>
<p>The chapter culminates in the decision to wipe the slate clean by formatting the existing Windows 10 installation. Reasons behind this choice include eliminating unnecessary clutter, ensuring a fresh start, and optimizing the system for its new role as a home server.</p>
<p>As there was a single SSD with multiple partitions, I combined them into one and went ahead with the Linux installation.</p>
<p>This video should be enough to get the steps on how to reset your Windows 10/11 machine:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=5OVwQdfUztU">https://www.youtube.com/watch?v=5OVwQdfUztU</a></div>
<p> </p>
<h2 id="heading-chapter-3-journey-to-ubuntu-installation"><strong>Chapter 3: Journey to Ubuntu Installation</strong></h2>
<p>For video guidance, you can use this video:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=oNEwEQ0uU1Y">https://www.youtube.com/watch?v=oNEwEQ0uU1Y</a></div>
<p> </p>
<h3 id="heading-flashing-the-drive"><strong>Flashing the Drive</strong></h3>
<p>With the old PC prepped and ready, it was time to introduce it to its new companion - a shiny USB drive. I chose to use Balena Etcher for this job, a tool that makes flashing the drive a breeze. A few clicks, and we were set to roll.</p>
<h3 id="heading-navigating-the-bios"><strong>Navigating the BIOS</strong></h3>
<p>Ah, the backstage BIOS pass to your PC's inner workings. Before diving into the installation, a pitstop in the BIOS was essential. Adjusting settings, ensuring compatibility, and making sure the USB drive took center stage in the boot order.</p>
<h3 id="heading-setting-boot-priority"><strong>Setting Boot Priority</strong></h3>
<p>The boot priority dance was next on the agenda. I wanted the system to look at the USB drive first, ensuring a smooth transition from the flashing process to the Ubuntu installation.</p>
<h3 id="heading-installation-initiated"><strong>Installation Initiated</strong></h3>
<p>With the USB drive ready and the boot priority set, it was time for the main event - installing Ubuntu. The familiar installation wizard guided me through the process, prompting me for language preferences, time zone settings, and user details. A few clicks later, the installation was underway.</p>
<h3 id="heading-no-ethernet-cable-no-problem"><strong>No Ethernet Cable, No Problem</strong></h3>
<p>Ah, the hiccup. No Ethernet cable on hand meant no updated packages during installation. But worry not; I opted for a minimal install to keep things straightforward. I would need to set up my internet later.</p>
<p>And there you have it - Chapter 3! The USB drive is flashed, the BIOS is in check, and Ubuntu is making its home on the old PC. Stay tuned for the next chapter, where we tackle connection challenges and navigate the intricacies of USB tethering for internet access during and after installation. 🌐</p>
<h2 id="heading-chapter-4-usb-tethering-to-mobile-phone"><strong>Chapter 4: USB Tethering to Mobile Phone</strong></h2>
<h3 id="heading-overcoming-connection-challenges-during-installation"><strong>Overcoming Connection Challenges During Installation</strong></h3>
<p>As the Ubuntu installation progressed, a familiar hurdle emerged – the absence of an Ethernet connection. No Ethernet, no problem - but a solution was in order. This chapter delves into the challenges faced and the journey toward utilizing USB tethering to a mobile phone as the savior.</p>
<h3 id="heading-introduction-to-usb-tethering"><strong>Introduction to USB Tethering</strong></h3>
<p>With no Ethernet cable in sight, the spotlight turned to USB tethering. This technology, often underutilized, allows a seamless connection between a computer and a mobile device, effectively transforming the mobile phone into a gateway to the digital realm.</p>
<h3 id="heading-setting-up-ip-and-gateway-via-usb-tethering"><strong>Setting Up IP and Gateway via USB Tethering</strong></h3>
<p><strong>The Failed Attempt with Manual Net Tools Installation</strong></p>
<p>Initially, I attempted a manual installation of net tools from the official Ubuntu website. The process involved mounting the tools on a USB drive, but alas, it proved to be a cumbersome task and didn't yield the desired results. It was time to pivot.</p>
<p><strong>Embracing USB Tethering</strong></p>
<p>Enter USB tethering - a more straightforward and reliable solution. Connecting the mobile phone to the PC via USB cable initiated the tethering process. A quick check using <code>ip link</code> revealed the available network interfaces, among which the USB-tethered interface took center stage.</p>
<p><strong>Configuring IP and Gateway Settings</strong></p>
<p>The next step involved setting up the IP address and gateway for the USB-tethered interface. A judicious choice of values ensured a stable internet connection.</p>
<p>Let's go through the steps:</p>
<ol>
<li><p>Check available network interfaces</p>
<pre><code class="lang-bash"> ip link
</code></pre>
</li>
<li><p>Identify the USB interface in the output and assign it an IP address manually. The example assumes the address <code>192.168.42.10/24</code> and the interface <code>enp0s20u1</code>.</p>
<pre><code class="lang-bash"> sudo ip addr add 192.168.42.10/24 dev enp0s20u1
 sudo ip link <span class="hljs-built_in">set</span> enp0s20u1 up
</code></pre>
</li>
<li><p>Set the default gateway using the following command</p>
<pre><code class="lang-bash"> sudo ip route add default via 192.168.42.1 dev enp0s20u1
</code></pre>
</li>
<li><p>Reboot</p>
<pre><code class="lang-bash"> reboot
</code></pre>
</li>
<li><p>Try to ping an external IP to confirm if the connection works.</p>
<pre><code class="lang-bash"> ping 8.8.8.8
</code></pre>
</li>
</ol>
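<p><em>In case of errors, check your firewall rules. Also note that addresses and routes added with <code>ip</code> do not persist across a reboot, so if the connection is gone after step 4, repeat steps 2 and 3.</em></p>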
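<p>To double-check the result of steps 2 and 3, it helps to look at what the kernel actually has configured. The sketch below is a small convenience of mine, not part of the original setup: <code>default_gateway</code> is a hypothetical helper that pulls the gateway address out of the <code>ip route</code> output. Keep in mind that the gateway has to be the phone's own address on the tethering network, which can differ between devices.</p>
<pre><code class="lang-bash"># Hypothetical helper: read "ip route" output on stdin and print the
# address of the default gateway, if there is one.
default_gateway() {
  awk '$1 == "default" &amp;&amp; $2 == "via" { print $3; exit }'
}

# Usage, assuming the interface name from the steps above:
#   ip -4 addr show dev enp0s20u1               # the address assigned in step 2
#   ip route show default | default_gateway     # the gateway set in step 3
#   ping -c 3 8.8.8.8                           # send three probes, then stop
</code></pre>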
<h3 id="heading-achieving-internet-access"><strong>Achieving Internet Access</strong></h3>
<p>With the USB tethering setup, the old PC was now online. Internet access meant that the installation process could proceed without any hiccups.</p>
<p><strong>Updating Linux and Adding Net Tools</strong></p>
<p>The newfound internet connection was immediately put to use. The system underwent a thorough update, ensuring that the latest Linux packages were on board. Additionally, the previously elusive net tools were seamlessly added to the arsenal, simplifying future networking tasks.</p>
<h2 id="heading-chapter-5-rtl8812au-driver-woes"><strong>Chapter 5: RTL8812AU Driver Woes</strong></h2>
<p><em>Device Name: TP-Link Archer T2U (RTL8812AU)</em></p>
<h3 id="heading-dealing-with-the-absence-of-an-official-driver"><strong>Dealing with the Absence of an Official Driver</strong></h3>
<p>The path to a fully functional home server encountered a significant hurdle when the built-in Wi-Fi adapter, the TP-Link RTL8812AU, found no official Linux support. Undeterred, I delved into the challenge, determined to find a workaround.</p>
<h3 id="heading-discovering-community-wisdom-on-mint-forum"><strong>Discovering Community Wisdom on Mint Forum</strong></h3>
<p>A ray of hope emerged when I stumbled upon a <a target="_blank" href="https://forums.linuxmint.com/viewtopic.php?t=307023">community post on the Mint forum</a>. Fellow enthusiasts faced similar RTL8812AU woes. The community post led me to a GitHub repository housing a solution to my driver predicament. The repository contained not only the necessary driver files but also somewhat detailed instructions in the README file.</p>
<h3 id="heading-installing-the-driver-using-readme-instructions"><strong>Installing the Driver Using Readme Instructions</strong></h3>
<p>After cloning the repository, I followed the step-by-step instructions from the README file to install the drivers. Commands were entered, configurations were adjusted, and dependencies were resolved, all in alignment with the provided steps. With the driver installed, a quick network restart was in order. The moment of truth arrived as I eagerly awaited the appearance of the Wi-Fi interface, signaling that the RTL8812AU driver had successfully integrated with the system.</p>
<p>The Driver and the steps to install can be found <a target="_blank" href="https://github.com/aircrack-ng/rtl8812au">here</a>.</p>
<h3 id="heading-setting-up-netplan-to-configure-networking"><strong>Setting Up Netplan to Configure Networking</strong></h3>
<p>I then configured the network settings using Netplan. This involved writing a Netplan YAML file that specifies the DHCP settings, the nameservers, and the login credentials for the Wi-Fi network. Running <code>sudo netplan apply</code> applied the configuration to the system, and the network came up with a stable connection.</p>
<p>With Netplan's configurations in place, the once elusive Wi-Fi connection now offered a gateway to the internet. The old PC was now fully equipped, ready to explore the digital landscape and fulfill its role as a reliable home server.</p>
<p>I used this video to get my wifi to work:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=Dacn58kgMXA">https://www.youtube.com/watch?v=Dacn58kgMXA</a></div>
<p> </p>
<p>My netplan configuration yaml looked as follows:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">network:</span>
  <span class="hljs-attr">ethernets:</span>
    <span class="hljs-attr">enp3s0:</span>
      <span class="hljs-attr">optional:</span> <span class="hljs-literal">true</span>
      <span class="hljs-attr">dhcp4:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">usb0:</span>
      <span class="hljs-attr">dhcp4:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">version:</span> <span class="hljs-number">2</span>
  <span class="hljs-attr">wifis:</span>
    <span class="hljs-string">&lt;interface_name&gt;:</span>
      <span class="hljs-attr">dhcp4:</span> <span class="hljs-literal">no</span>
      <span class="hljs-attr">addresses:</span> [<span class="hljs-string">&lt;static_ip_you_want_to_assign&gt;/24</span>]
      <span class="hljs-attr">routes:</span>
       <span class="hljs-bullet">-</span> <span class="hljs-attr">to:</span> <span class="hljs-string">default</span>
         <span class="hljs-attr">via:</span> <span class="hljs-string">&lt;gateway&gt;</span>
      <span class="hljs-attr">nameservers:</span>
        <span class="hljs-attr">addresses:</span> [<span class="hljs-number">1.1</span><span class="hljs-number">.1</span><span class="hljs-number">.1</span>, <span class="hljs-number">1.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span>]
      <span class="hljs-attr">access-points:</span>
        <span class="hljs-string">&lt;Network_name_SSID&gt;:</span>
          <span class="hljs-attr">password:</span> <span class="hljs-string">&lt;password&gt;</span>
</code></pre>
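<p>For completeness, here is roughly how a file like this gets applied. The file name <code>/etc/netplan/01-netcfg.yaml</code> is an assumption (use whatever file exists in <code>/etc/netplan/</code> on your system), and <code>apply_netplan</code> is a hypothetical wrapper of mine.</p>
<pre><code class="lang-bash"># Hypothetical wrapper around the usual Netplan workflow.
apply_netplan() {
  file="$1"
  # The file holds the Wi-Fi password, so restrict it to root
  sudo chmod 600 "$file"
  # Parse the YAML and generate the backend configuration; syntax errors show up here
  sudo netplan generate
  # Make the configuration effective
  sudo netplan apply
}

# Usage (file name is an assumption):
#   apply_netplan /etc/netplan/01-netcfg.yaml
#   ip -4 addr show    # the Wi-Fi interface should now carry the static address
#
# Alternative: "sudo netplan try" applies the configuration and rolls it back
# unless you confirm, which is handy when you are connected remotely.
</code></pre>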
<h2 id="heading-chapter-6-unlocking-advanced-capabilities"><strong>Chapter 6: Unlocking Advanced Capabilities</strong></h2>
<p>As the home server project unfolded, the quest for enhanced functionalities led to the introduction of several powerful features, turning the old PC into a versatile hub for various applications.</p>
<h3 id="heading-sambashare-for-nas"><strong>SambaShare for NAS</strong></h3>
<p>Embracing the concept of Network-Attached Storage (NAS), the server now boasts SambaShare integration. This enables seamless file sharing across devices within the network. Whether it's documents, media files, or backups, the NAS capabilities provide a centralized repository accessible from any connected device.</p>
<p>Check this video out for help with installation:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=0-T7af_lRF8&amp;t=945s">https://www.youtube.com/watch?v=0-T7af_lRF8&amp;t=945s</a></div>
<p> </p>
<h3 id="heading-plex-for-media-streaming"><strong>Plex for Media Streaming</strong></h3>
<p>The entertainment dimension received a significant upgrade with the integration of Plex. Now, the server doubles as a media streaming powerhouse. Plex not only organizes the media library but also enables streaming on-demand, turning the old PC into a personal media center.</p>
<p>Check this video out for help with installation:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=QEP5Tq78cHw">https://www.youtube.com/watch?v=QEP5Tq78cHw</a></div>
<p> </p>
<h3 id="heading-docker-minikube-and-openssh-for-remote-development"><strong>Docker, Minikube, and OpenSSH for Remote Development</strong></h3>
<p>The journey into the world of containerization began with the installation of Docker, which lets applications run in isolated containers that are easy to deploy and manage. Minikube, on the other hand, runs a small local Kubernetes cluster on the same machine, providing a platform for experimenting with container orchestration.</p>
<p>Enabling secure remote access for development purposes, OpenSSH was configured. This feature facilitates a secure shell connection, allowing developers to access and manage the server remotely. Coupled with Visual Studio Code's Remote SSH extension, the development workflow is further streamlined, providing a seamless and efficient coding environment.</p>
<p>Check out <a class="user-mention" href="https://hashnode.com/@hiteshchoudharylco">Hitesh Choudhary</a> and other tutorials on YT to install these!</p>
<h2 id="heading-chapter-7-conclusion">Chapter 7: Conclusion</h2>
<p>I'm finally at a stage where I can shift my entire development workload onto a Linux system and avoid the incessant Windows-shaming I've been facing over the years. Having a server ready can also help me get a better grip on computer networks and Linux filesystems, as well as learn advanced DevOps topics.</p>
<hr />
<p>If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Striking the Right Chord: Gaming and Beyond with Python-Powered Audio Magic]]></title><description><![CDATA[Okay! I agree the title sounds too overly technical. Long story short, I made a program to play Counter-Strike using my guitar. Why write a blog about a program that lets you play Counter-Strike with a guitar? Well, truth be told, it doesn't have muc...]]></description><link>https://highonbugs.sbk2k1.in/sows</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/sows</guid><category><![CDATA[Python]]></category><category><![CDATA[audio]]></category><category><![CDATA[guitar]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Thu, 05 Oct 2023 18:05:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1696528699376/123ad5c6-b8db-4a2a-9552-2ffb39ac9c54.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Okay! I agree the title sounds too overly technical. Long story short, I made a program to play Counter-Strike using my guitar. Why write a blog about a program that lets you play Counter-Strike with a guitar? Well, truth be told, it doesn't have much practical use. So why bother? The reason is to spotlight the valuable building blocks within the project that could benefit you.</p>
<p>If you want to check the software out for yourself, visit the <a target="_blank" href="https://sows.sbk2k1.tech/">SOWS Website</a>.</p>
<p>Here is an early usage video of the software.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=NNfp-58yXsA">https://www.youtube.com/watch?v=NNfp-58yXsA</a></div>
<p> </p>
<p>So let us get into answering the questions of how and why.</p>
<h2 id="heading-introduction">Introduction</h2>
<h3 id="heading-capabilities">Capabilities</h3>
<ol>
<li><p><strong>Driver-Swapping Wizardry</strong>: This thing can seamlessly switch between different audio drivers. Whether you're rocking out with your trusty headphones or getting all fancy with an audio interface, it's got your back.</p>
</li>
<li><p><strong>Audio Detective Mode</strong>: You can stop actions and see the notes you are producing. So in theory you can replace your guitar tuner. (Not Recommended)</p>
</li>
<li><p><strong>Musical Notes Meet Gaming Actions</strong>: Here's the kicker – it takes those signals, turns them into musical notes, and then maps those notes to in-game actions. Picture strumming a power chord to toss a grenade or hitting a sweet riff to reload your weapon. It's not that crazy. (Yet?)</p>
</li>
</ol>
<h3 id="heading-motivation-behind-the-madness"><strong>Motivation Behind the Madness:</strong></h3>
<p>Now, you're probably wondering why in the world someone would come up with this concoction. Well, I wanted to merge two things I absolutely adore: jamming on the guitar and going all out in Counter-Strike. The wild and wacky creations of developers like <a target="_blank" href="https://www.youtube.com/@MichaelReeves">Michael Reeves</a> were a catalyst. He showed me that the craziest ideas can lead to the most fun and innovative projects.</p>
<h1 id="heading-i-the-foundations">I. The Foundations</h1>
<h3 id="heading-section-11-python-object-oriented-programming-oop">Section 1.1: <strong>Python Object-Oriented Programming (OOP)</strong></h3>
<p><strong>Importance of OOP in Software Development:</strong> Object-oriented programming (OOP) is the backbone of many modern software projects, and it plays a crucial role in making code more organized, modular, and maintainable. It's like building with Lego bricks; you create reusable components (objects) with their own data and behavior, making it easier to manage complexity as your project grows.</p>
<p><strong>Project Structure Using Python Classes and Objects:</strong> In my project, we've leveraged OOP to structure the code effectively. Let's take a quick peek at how it's done:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Ui_MainWindow</span>:</span>
    <span class="hljs-comment"># This class handles the user interface of my application.</span>
    <span class="hljs-comment"># It's structured using Qt Designer and PyQt5.</span>
    <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># body omitted in this overview</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">CustomEventFilter</span>:</span>
    <span class="hljs-comment"># CustomEventFilter is a class designed to filter and process user input events.</span>
    <span class="hljs-comment"># It prevents keyboard mappings from interacting with the program itself.</span>
    <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># body omitted in this overview</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ActionHandler</span>:</span>
    <span class="hljs-comment"># ActionHandler is responsible for mapping audio signals to specific in-game actions.</span>
    <span class="hljs-comment"># It encapsulates the logic for translating guitar sounds into game commands.</span>
    <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># body omitted in this overview</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AudioProcessingThread</span>:</span>
    <span class="hljs-comment"># AudioProcessingThread is a separate thread that handles audio detection and processing.</span>
    <span class="hljs-comment"># This ensures that audio-related tasks don't block the main application thread.</span>
    <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># body omitted in this overview</span>
</code></pre>
<p><strong>Code Examples Illustrating Key OOP Concepts:</strong> Let's delve into some code snippets to see OOP concepts in action within my software:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> PyQt5.QtCore <span class="hljs-keyword">import</span> QObject, QThread

<span class="hljs-comment"># Example of Encapsulation</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ActionHandler</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.actions = {}  <span class="hljs-comment"># This dictionary encapsulates our actions and their corresponding mappings.</span>

<span class="hljs-comment"># Example of Inheritance</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">CustomEventFilter</span>(<span class="hljs-params">QObject</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        super().__init__()  <span class="hljs-comment"># We inherit from QObject so the filter can be installed with installEventFilter().</span>

<span class="hljs-comment"># Example of Polymorphism</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AudioProcessingThread</span>(<span class="hljs-params">QThread</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-comment"># Here, we override the 'run' method to provide our behavior while utilizing QThread's functionality.</span>
        <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># the audio detection loop goes here</span>
</code></pre>
<h3 id="heading-section-12-desktop-application-with-pyqt5-and-pyside"><strong>Section 1.2: Desktop Application with PyQt5 and PySide</strong></h3>
<p><strong>Choice of PyQt5 and PySide:</strong> I opted for PyQt5 and PySide to create the desktop application because of my past (if limited) experience working with them on <a target="_blank" href="https://mnemosyne.sbk2k1.tech/">Mnemosyne</a>.</p>
<p><strong>Creating a Desktop App in Python:</strong> Building a desktop app in Python using PyQt5 and PySide involves several steps:</p>
<ol>
<li><p><strong>Designing the UI</strong>: We've used Qt Designer to design the user interface. It allows for a visual drag-and-drop interface design, which simplifies the process.</p>
</li>
<li><p><strong>Creating the Main Window</strong>: We've created a <code>Ui_MainWindow</code> class to set up the main window of the application. This class is generated from the UI design created in Qt Designer.</p>
</li>
<li><p><strong>Event Handling</strong>: The <code>CustomEventFilter</code> class handles user input events, ensuring that the app is not affected by actions triggered by the audio.</p>
</li>
<li><p><strong>Threading</strong>: To keep the app responsive, we use the <code>AudioProcessingThread</code> class to handle audio detection in a separate thread, preventing the main thread from being blocked.</p>
</li>
</ol>
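<p>The threading step above can be sketched with the standard library alone. This is a minimal illustration and not the project's code: it uses <code>threading.Thread</code> as a stand-in for <code>QThread</code>, and the names <code>ProcessingThread</code>, <code>chunks</code>, and <code>results</code> are hypothetical. The worker overrides <code>run</code>, does its work off the main thread, and hands results back through a queue.</p>
<pre><code class="lang-python">import queue
import threading


class ProcessingThread(threading.Thread):
    """Worker that computes the peak amplitude of each audio chunk."""

    def __init__(self, chunks, results):
        super().__init__(daemon=True)
        self.chunks = chunks
        self.results = results

    def run(self):
        # Overridden entry point: runs in the worker thread after start()
        for chunk in self.chunks:
            self.results.put(max(abs(sample) for sample in chunk))


results = queue.Queue()
worker = ProcessingThread([[0.1, -0.4], [0.9, 0.2]], results)
worker.start()   # the main thread stays free while the worker runs
worker.join()
chunk_peaks = [results.get() for _ in range(2)]
print(chunk_peaks)
</code></pre>
<p>In a Qt application, the worker would more typically report back by emitting a signal that the main window connects to, but the division of labour is the same.</p>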
<p><strong>Showcasing the User Interface:</strong> My user interface, designed with PyQt5 and PySide, provides an intuitive way to interact with the application. Users can select audio drivers, visualize audio input, start a test mode to check out the note detected, and review the controls.</p>
<p>That's the foundation of the project structure and how I harnessed OOP principles to keep things organized and manageable while creating a desktop application using PyQt5 and PySide. These elements work in harmony to bring the magic of playing Counter-Strike with a guitar to life!</p>
<h2 id="heading-ii-audio-processing">II. Audio Processing</h2>
<h3 id="heading-section-21-audio-driver-integration">Section 2.1: <strong>Audio Driver Integration</strong></h3>
<p><strong><em>Interacting with Audio Drivers</em></strong>: In the heart of the software, there's a crucial component that enables the magic to happen - the interaction with audio drivers. This interaction allows us to tap into the audio data from your audio driver and make sense of it. Here's a glimpse of how it works:</p>
<p>Our software utilizes the <a target="_blank" href="https://pypi.org/project/PyAudio/">PyAudio</a> library to manage audio input. With PyAudio, we can establish connections with audio drivers, be it your headphone drivers or any audio interface you prefer. This gives us access to the raw audio data that flows through your system.</p>
<p><strong><em>Challenges and Considerations</em></strong>: Working with audio interfaces presents its fair share of challenges. Ensuring compatibility across a wide range of drivers and hardware configurations can be tricky. We must consider issues like latency, device selection, and data format when dealing with these interfaces.</p>
<p>To tackle these challenges, I configured the software with the following settings:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Audio settings</span>
self.buffer_size = <span class="hljs-number">1024</span>
self.pyaudio_format = pyaudio.paFloat32
self.n_channels = <span class="hljs-number">1</span>
self.samplerate = <span class="hljs-number">48000</span>
self.lowest_pitch = <span class="hljs-number">5</span>
self.testing = <span class="hljs-literal">False</span>
</code></pre>
<p>These settings help us ensure that the audio input is processed efficiently and that the guitar notes are captured accurately.</p>
<p><strong>Code Snippets:</strong> Here's an example of how we set up the audio stream with PyAudio in the software:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Initialize PyAudio</span>
audio = pyaudio.PyAudio()

<span class="hljs-comment"># Configure audio stream</span>
self.buffer_size = <span class="hljs-number">1024</span>
self.pyaudio_format = pyaudio.paFloat32
self.n_channels = <span class="hljs-number">1</span>
self.samplerate = <span class="hljs-number">48000</span>

<span class="hljs-comment"># Open an audio stream</span>
self.stream = audio.open(format=self.pyaudio_format,
                         channels=self.n_channels,
                         rate=self.samplerate,
                         input=<span class="hljs-literal">True</span>,
                         frames_per_buffer=self.buffer_size)
</code></pre>
<p>This code establishes a connection with the audio driver, configures the audio stream, and prepares to receive audio data from your guitar.</p>
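<p>To get a feel for what comes out of such a stream, here is a small sketch using only the standard library. PyAudio's <code>stream.read</code> returns raw bytes, and with <code>paFloat32</code> on a single channel each sample takes four bytes. The helper name <code>decode_float32_mono</code> and the silent <code>fake_chunk</code> are made up for illustration, so treat this as an assumption about how a buffer could be decoded rather than what the software does. It also works out how much audio one buffer holds at the settings above.</p>
<pre><code class="lang-python">from array import array

BUFFER_SIZE = 1024   # frames per buffer, as configured above
SAMPLERATE = 48000   # samples per second, as configured above


def decode_float32_mono(raw_bytes):
    """Turn a chunk of raw float32 bytes into a sequence of samples."""
    samples = array("f")
    samples.frombytes(raw_bytes)
    return samples


# One buffer covers roughly 21 ms of audio, which bounds how quickly a note can be noticed
buffer_duration_ms = 1000 * BUFFER_SIZE / SAMPLERATE

# Stand-in for stream.read(BUFFER_SIZE): one buffer of silence
fake_chunk = array("f", [0.0] * BUFFER_SIZE).tobytes()
decoded = decode_float32_mono(fake_chunk)
print(len(fake_chunk), len(decoded), round(buffer_duration_ms, 1))
</code></pre>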
<h3 id="heading-section-22-fourier-transform-for-frequency-analysis">Section 2.2: <strong>Fourier Transform for Frequency Analysis</strong></h3>
<p><strong><em>Introduction to Fourier Transform</em></strong>: Imagine you have a complex sound, like the music from your guitar. It is not a single waveform with one identifiable frequency. It is a mixture of simpler waveforms with different frequencies, and that mixture is what lets us pick out the individual notes being played.</p>
<p>Here's how it works:</p>
<ol>
<li><p><strong>Sound Waves as Building Blocks:</strong> Sound is a wave, and complex sounds are made up of simpler waveforms. Each musical note you play on your guitar can be thought of as a unique waveform.</p>
</li>
<li><p><strong>Breaking Down the Sound:</strong> The Fourier Transform takes the complex sound and breaks it down into these simple waveforms, sort of like taking a big jigsaw puzzle and separating it into its individual pieces.</p>
</li>
<li><p><strong>Frequency Analysis:</strong> Each of these simple waveforms has a single frequency, and the strongest low frequency in the mix is what you hear as the musical note.</p>
</li>
<li><p><strong>Quantifying the Notes:</strong> The Fourier Transform quantifies how much of each of these simple waveforms is present in the complex sound. It tells us, "Hey, you've got a lot of this note, a little of that note," and so on.</p>
</li>
</ol>
<p>So, in a nutshell, the Fourier Transform is like a magical tool that takes a complex sound, dissects it into its individual musical notes, and tells us how much of each note is in there. It's like breaking down a song into its musical ingredients, and it's incredibly useful in understanding and working with sounds in various fields, from music to engineering.</p>
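<p>The steps above can be made concrete with a tiny, deliberately naive Discrete Fourier Transform written in plain Python. This is a teaching sketch and not what the software runs: the function names, the 800 Hz sample rate, and the test signal (a 110 Hz tone mixed with a quieter 220 Hz tone) are all assumptions chosen so the numbers come out cleanly.</p>
<pre><code class="lang-python">import cmath
import math


def dft_magnitudes(samples):
    """Return how strongly each frequency bin is present (naive DFT)."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):  # the upper half mirrors the lower half for real input
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total) / n)
    return magnitudes


def strongest_frequencies(samples, samplerate, count=2):
    """Return the frequencies (in Hz) of the strongest bins, lowest first."""
    magnitudes = dft_magnitudes(samples)
    bin_width = samplerate / len(samples)
    top_bins = sorted(
        range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True
    )[:count]
    return sorted(k * bin_width for k in top_bins)


samplerate = 800
signal = [
    math.sin(2 * math.pi * 110 * t / samplerate)
    + 0.5 * math.sin(2 * math.pi * 220 * t / samplerate)
    for t in range(400)  # 400 samples at 800 Hz gives 2 Hz wide bins
]
found = strongest_frequencies(signal, samplerate)
print(found)
</code></pre>
<p>Real code would use a fast implementation (an FFT) instead of this quadratic loop, but the idea is identical: feed in a mixed signal, get back the ingredients.</p>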
<p>You can check out this amazing video from 3B1B to get it:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=spUNpyF58BY&amp;t=850s">https://www.youtube.com/watch?v=spUNpyF58BY&amp;t=850s</a></div>
<p> </p>
<p><strong><em>Role of Fourier Transform</em></strong>: The Fourier Transform is a go-to technique for dissecting audio signals. It breaks down complex sound waves into their individual frequency components. For us, this means identifying the notes your guitar is playing by examining the frequency of the sound.</p>
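<p><strong><em>Fourier Transform Implementation</em></strong>: Here's a high-level overview of how the frequency analysis is done in the software. Note that the snippet relies on Aubio's <code>"yin"</code> pitch detector, which estimates the fundamental frequency directly rather than returning a full spectrum:</p>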
<pre><code class="lang-python"><span class="hljs-keyword">import</span> aubio
<span class="hljs-comment"># Create the pitch detector: method, buffer size, hop size, sample rate</span>
pitch_detector = aubio.pitch(<span class="hljs-string">"yin"</span>, self.buffer_size, self.buffer_size, self.samplerate)
pitch_detector.set_unit(<span class="hljs-string">"Hz"</span>)
<span class="hljs-comment"># Run pitch detection on one buffer of float32 samples</span>
pitch_frequency = pitch_detector(self.audio_data)[<span class="hljs-number">0</span>]
<span class="hljs-comment"># The confidence of the estimate is read separately</span>
confidence = pitch_detector.get_confidence()
</code></pre>
<p>In this code snippet, we use the Aubio library to create a pitch detection object. We then feed it the audio data, and it returns the pitch or frequency information. This frequency data is what we use to map guitar notes to in-game actions.</p>
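<p>The post does not show how a detected frequency becomes a note name, so the following is only a sketch of one common approach, assuming equal temperament with A4 tuned to 440 Hz. The function name <code>frequency_to_note</code> is hypothetical, and a reading of 0 Hz is treated here as "no pitch detected".</p>
<pre><code class="lang-python">import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(frequency_hz, a4_hz=440.0):
    """Map a frequency to the nearest note name (the octave is ignored)."""
    if frequency_hz == 0:
        return None  # assumption: 0 Hz means no pitch was detected
    # MIDI note 69 is A4, and each semitone is a factor of 2 ** (1 / 12)
    midi_number = round(69 + 12 * math.log2(frequency_hz / a4_hz))
    return NOTE_NAMES[midi_number % 12]


open_low_e = frequency_to_note(82.41)   # open low E string of a guitar
concert_a = frequency_to_note(440.0)
print(open_low_e, concert_a)
</code></pre>
<p>Ignoring the octave keeps the mapping small: any "C" on the fretboard triggers the same action, wherever it is played.</p>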
<p>So, there you have it! We've unveiled how the software handles audio drivers, grapples with audio interfaces, and employs the Fourier Transform to turn your guitar sounds into a symphony of gaming actions. With these elements in play, you're one step closer to rocking out in Counter-Strike like never before!</p>
<h2 id="heading-iii-mapping-audio-frequencies-to-actions">III. <strong>Mapping Audio Frequencies to Actions</strong></h2>
<p><strong><em>Mapping Audio Frequencies to Actions</em></strong>: Now, let's delve into the exciting part – mapping the frequencies generated by your guitar to specific keyboard and mouse actions within the software. This is where the magic happens! We've got a dictionary of musical notes, and each note corresponds to a unique action, such as moving, shooting, or jumping.</p>
<p><strong><em>Examples of Frequency-to-Action Mapping</em></strong>: Here's a sneak peek at how some musical notes translate into actions:</p>
<ul>
<li><p>When you play the note "C," it's like moving forward in the game.</p>
</li>
<li><p>If you hit "D," it's equivalent to moving backward.</p>
</li>
<li><p>"G#" on your guitar will make your character jump in the game.</p>
</li>
<li><p>"A#" triggers shooting, while "B" initiates a reload.</p>
</li>
</ul>
<p>These mappings allow you to control your game character by playing your guitar. It's like turning your guitar into a gaming controller!</p>
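<p>As a rough sketch, the mapping listed above fits naturally into a dictionary lookup. The note names come from the list; the action labels and the helper <code>action_for_note</code> are placeholders invented for this example, not the names used in the software.</p>
<pre><code class="lang-python"># Notes as listed above; the action labels are placeholders
NOTE_ACTIONS = {
    "C": "move_forward",
    "D": "move_backward",
    "G#": "jump",
    "A#": "shoot",
    "B": "reload",
}


def action_for_note(note):
    """Return the action mapped to a note, or None if the note is unmapped."""
    return NOTE_ACTIONS.get(note)


played = ["C", "G#", "F", "B"]
triggered = [action_for_note(note) for note in played]
print(triggered)
</code></pre>
<p>Unmapped notes simply come back as <code>None</code>, so a stray "F" does nothing instead of raising an error mid-game.</p>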
<p><em>Toggles and Logic:</em> But hold on, there's more! The software incorporates toggles and logic to make the gameplay experience smoother. For instance, if you play the "A" note, it toggles crouching on or off. So, the first "A" press crouches, and the next one stands your character back up. This smart logic ensures that you have control over these actions without any fuss. Moreover, we control 2D movement using a list that tracks movement along the X and Y axes. Let's break down the logic and usage of lists in these functions:</p>
<pre><code class="lang-python"><span class="hljs-comment"># function to move forward or stop moving backwards</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">moveForward</span>(<span class="hljs-params">self</span>):</span>
    <span class="hljs-keyword">if</span> self.navigation_mapping[<span class="hljs-number">0</span>] == <span class="hljs-literal">None</span>:
        self.navigation_mapping[<span class="hljs-number">0</span>] = <span class="hljs-number">1</span>
        keyboard.press(<span class="hljs-string">'w'</span>)
    <span class="hljs-keyword">elif</span> self.navigation_mapping[<span class="hljs-number">0</span>] == <span class="hljs-number">-1</span>:
        self.navigation_mapping[<span class="hljs-number">0</span>] = <span class="hljs-literal">None</span>
        keyboard.release(<span class="hljs-string">'s'</span>)
</code></pre>
<p>In this code, we have a function called <code>moveForward</code> that is responsible for moving your character forward in the game. Here's how it works:</p>
<ol>
<li><p><code>self.navigation_mapping</code> is a list used to keep track of your character's movement in two directions: forward/backward (index 0) and left/right (index 1).</p>
</li>
<li><p>When you call <code>moveForward</code>, the function first checks if <code>self.navigation_mapping[0]</code> is <code>None</code>. This check ensures that if you're already moving backward (indicated by <code>-1</code>), you won't start moving forward again immediately.</p>
</li>
<li><p>If <code>self.navigation_mapping[0]</code> is indeed <code>None</code>, it sets it to <code>1</code> to indicate that your character is now moving forward. Additionally, it simulates a key press of the 'w' key using <code>keyboard.press('w')</code>.</p>
</li>
<li><p>However, if <code>self.navigation_mapping[0]</code> is <code>-1</code>, it means your character is currently moving backward. In this case, the function sets <code>self.navigation_mapping[0]</code> back to <code>None</code> to stop moving backward and releases the 's' key to stop moving in that direction.</p>
</li>
</ol>
<p>This logic ensures that your character can smoothly switch between moving forward and stopping moving backward, allowing for responsive and intuitive control.</p>
<p>Similar logic applies to the <code>moveBackwards</code> and <code>moveLeft</code> functions, where the list <code>self.navigation_mapping</code> is used to keep track of the character's movement in different directions, and key presses and releases are simulated accordingly to provide seamless control over your character's movement.</p>
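<p>To see the forward/backward logic and the crouch toggle working together, here is a self-contained sketch. It is a guess at the shape of the code, not a copy of it: <code>FakeKeyboard</code> stands in for the key-simulation library so the example runs anywhere, <code>MovementState</code> and <code>toggleCrouch</code> are hypothetical names, <code>moveBackwards</code> is assumed to mirror <code>moveForward</code>, and the crouch key is assumed to be <code>"ctrl"</code>.</p>
<pre><code class="lang-python">class FakeKeyboard:
    """Stand-in for a key-simulation library: remembers which keys are held."""

    def __init__(self):
        self.held = set()

    def press(self, key):
        self.held.add(key)

    def release(self, key):
        self.held.discard(key)


class MovementState:
    def __init__(self, keyboard):
        self.keyboard = keyboard
        self.navigation_mapping = [None, None]  # [forward/backward, left/right]
        self.crouching = False

    def moveForward(self):
        if self.navigation_mapping[0] is None:
            self.navigation_mapping[0] = 1
            self.keyboard.press("w")
        elif self.navigation_mapping[0] == -1:
            self.navigation_mapping[0] = None
            self.keyboard.release("s")

    def moveBackwards(self):
        # Assumed mirror image of moveForward
        if self.navigation_mapping[0] is None:
            self.navigation_mapping[0] = -1
            self.keyboard.press("s")
        elif self.navigation_mapping[0] == 1:
            self.navigation_mapping[0] = None
            self.keyboard.release("w")

    def toggleCrouch(self):
        # Each call flips the state; "ctrl" is an assumed key binding
        self.crouching = not self.crouching
        if self.crouching:
            self.keyboard.press("ctrl")
        else:
            self.keyboard.release("ctrl")


keyboard = FakeKeyboard()
player = MovementState(keyboard)
player.moveForward()    # starts holding "w"
player.moveBackwards()  # releases "w", so the character stops
player.toggleCrouch()   # first call crouches
print(keyboard.held, player.navigation_mapping)
</code></pre>
<p>Notice that playing the "backward" note while moving forward stops the character instead of reversing it; a second "backward" note is what actually starts the retreat.</p>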
<h2 id="heading-iv-practical-applications">IV. Practical Applications</h2>
<h3 id="heading-section-41-practical-applications">Section 4.1: Practical Applications</h3>
<p>None, really! The goal is to help you understand the different parts that went into the software and to spark ideas of your own.</p>
<h3 id="heading-section-42-creative-possibilities">Section 4.2: Creative Possibilities</h3>
<p><em>Brainstorm Your Own Ideas:</em> I encourage you to think outside the box. This software is a canvas for your creativity. Pick out individual parts and apply them to your own ideas.</p>
<h2 id="heading-v-conclusion-and-further-exploration">V. Conclusion and Further Exploration</h2>
<p><strong><em>Summarizing the Journey</em></strong>: In this blog post, we've embarked on a journey into the world of using your guitar as a gaming controller. We've explored the technical aspects, mapped out actions, and even dabbled in the potential beyond just gaming with your guitar.</p>
<p><strong><em>Unleash Your Creativity</em></strong>: The software is a starting point for your own tech adventures. We invite you to explore each element in more detail. Dive into Python Object-Oriented Programming, discover the wonders of audio processing, and experiment with game development or other fields. Here are some resources to get you started:</p>
<ul>
<li><p><a target="_blank" href="https://www.python.org/doc/">Link to Python Documentation</a>: Learn more about Python, the language behind the software.</p>
</li>
<li><p><a target="_blank" href="https://doc.qt.io/qtforpython-6/">Link to Qt for Python (PySide) Documentation</a>: Learn about the framework used to make GUIs.</p>
</li>
<li><p>I'm sure you can look around for other stuff yourself. :)</p>
</li>
</ul>
<h2 id="heading-call-to-action">Call to Action</h2>
<p>Now, it's your turn! Download the software, give it a whirl, and share your experiences with us. We'd love to hear your feedback and suggestions for improvement. Join us in this journey of creativity and innovation, where music meets technology in exciting ways. Who knows what you'll come up with next?</p>
<p>I may also turn this into an open-source project. Some features I would love to add:</p>
<ul>
<li><p>Customizable Actions.</p>
</li>
<li><p>A Desktop overlay while playing games that shows active buttons.</p>
</li>
<li><p>Suggest your own.</p>
</li>
</ul>
<p>My Socials - <a target="_blank" href="https://t.co/OmrSzziq5s">here</a>!</p>
<hr />
<p>If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Introduction to MongoDB - Part 1]]></title><description><![CDATA[MongoDB is a popular NoSQL document database that stores data in flexible, JSON-like documents. Unlike traditional SQL databases, MongoDB does not require a predefined schema, making it easier to store and query data of varying types and structures.
...]]></description><link>https://highonbugs.sbk2k1.in/mongodb-part-1</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/mongodb-part-1</guid><category><![CDATA[MongoDB]]></category><category><![CDATA[Databases]]></category><category><![CDATA[NoSQL]]></category><category><![CDATA[data]]></category><category><![CDATA[MERN Stack]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Fri, 17 Mar 2023 20:28:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1679082874199/89660b92-c7e3-499d-972f-c0d5fcdd7063.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>MongoDB is a popular NoSQL document database that stores data in flexible, JSON-like documents. Unlike traditional SQL databases, MongoDB does not require a predefined schema, making it easier to store and query data of varying types and structures.</p>
<p>One of the main reasons why MongoDB is a popular choice for backend development is its scalability. MongoDB is designed to scale horizontally by adding more servers to handle increased traffic and data volume, making it an ideal choice for large-scale, high-traffic applications. Additionally, MongoDB's flexible data model makes it easier to adapt to changing business needs, reducing development time and effort.</p>
<p>Another key advantage of MongoDB is its ease of use. With MongoDB, developers can write queries using a simple syntax similar to JSON, which is easier to learn and understand than SQL. Additionally, MongoDB's query language supports a wide range of operations and allows for complex queries, making it well-suited for data analytics and reporting.</p>
<p>Finally, MongoDB is highly versatile and can be used in a wide range of applications, including web and mobile applications, content management systems, and Internet of Things (IoT) devices. Its compatibility with popular programming languages and frameworks, including Node.js and the MERN stack, makes it a popular choice for modern web application development.</p>
<h3 id="heading-how-is-mongodb-different">How is MongoDB different?</h3>
<p>MongoDB is a popular NoSQL document-oriented database that stores data in a flexible, JSON-like format called BSON. Unlike traditional relational databases, MongoDB does not require a predefined schema or structure for data storage, making it easier to store and query data of varying types and structures.</p>
<p>Some of the key differences between MongoDB and relational databases/SQL databases include:</p>
<ol>
<li><p><strong>Data model</strong>: MongoDB uses a document-based data model, whereas SQL databases use a table-based model. In MongoDB, documents are stored in collections, which can contain different types of data, while SQL databases require data to be structured in tables with predefined columns.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1679084683601/f91f05a5-b01a-4aed-b0fd-938f1f976472.jpeg" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Schema design</strong>: MongoDB does not require a predefined schema for data storage, whereas SQL databases require a schema to be defined in advance. This means that MongoDB can be more flexible when it comes to schema design, as changes can be made on-the-fly without affecting the overall database structure.</p>
</li>
<li><p><strong>Query language</strong>: MongoDB uses a simple, JSON-like query language, whereas SQL databases use SQL (Structured Query Language), which is more complex and requires knowledge of database schemas and table structures.</p>
</li>
</ol>
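<p>To give the "JSON-like query language" some shape, here is a toy matcher in plain Python. It is emphatically not how MongoDB evaluates queries, and it covers only equality and four comparison operators. The operator spellings (<code>$gt</code>, <code>$gte</code>, <code>$lt</code>, <code>$lte</code>) are MongoDB's own, while the <code>matches</code> function and the sample documents are invented for illustration.</p>
<pre><code class="lang-python">import operator

# Comparison operators as they are spelled in MongoDB filter documents
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 25}}
            if not all(
                OPERATORS[name](value, operand)
                for name, operand in condition.items()
            ):
                return False
        elif value != condition:
            # Plain form, e.g. {"city": "Pune"}
            return False
    return True


# Documents in one collection do not need to share the same fields
people = [
    {"name": "Asha", "age": 29},
    {"name": "Ravi", "age": 22, "city": "Kolkata"},
    {"name": "Meera", "city": "Pune"},
]
older = [p["name"] for p in people if matches(p, {"age": {"$gt": 25}})]
in_pune = [p["name"] for p in people if matches(p, {"city": "Pune"})]
print(older, in_pune)
</code></pre>
<p>The point to take away is that both the data and the query are just nested key-value structures, which is why the syntax feels familiar to anyone who has worked with JSON.</p>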
<p>Advantages of MongoDB:</p>
<ol>
<li><p><strong>Scalability</strong>: MongoDB is highly scalable and can easily handle large amounts of data and high levels of traffic by distributing data across multiple servers.</p>
</li>
<li><p><strong>Flexibility</strong>: MongoDB's flexible data model allows for easier schema design and more efficient storage and querying of complex data structures.</p>
</li>
<li><p><strong>Performance</strong>: MongoDB's indexing and sharding capabilities enable faster query times and efficient data retrieval.</p>
</li>
<li><p><strong>Open source</strong>: The source code of MongoDB Community Server is publicly available (under the Server Side Public License since late 2018), and it has a large community of developers, making it easy to find support and resources.</p>
</li>
</ol>
<p>Disadvantages of MongoDB:</p>
<ol>
<li><p><strong>Complexity</strong>: While MongoDB's flexible data model is an advantage, it can also be a disadvantage as it can be more complex to manage and query than a structured, relational database.</p>
</li>
<li><p><strong>Memory usage</strong>: MongoDB requires more memory than traditional SQL databases, as it relies heavily on in-memory caching for performance.</p>
</li>
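<li><p><strong>Transaction overhead</strong>: Operations on a single document are atomic, and multi-document ACID transactions have been supported since MongoDB 4.0 (extended to sharded clusters in 4.2). However, they add overhead and the data model is designed around avoiding them, so keeping related data consistent across documents can take more care than in a relational database.</p>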
</li>
</ol>
<h3 id="heading-terminologies">Terminologies</h3>
<p>Some of the terminologies used in MongoDB are:</p>
<ol>
<li><p><strong>Database</strong>: A MongoDB database is a container for collections of documents. Each database can have multiple collections, and collections can have multiple documents.</p>
</li>
<li><p><strong>Collection</strong>: A MongoDB collection is a group of documents that share a similar structure. Collections are analogous to tables in a relational database, but they do not have a predefined schema or structure.</p>
</li>
<li><p><strong>Document</strong>: In MongoDB, a document is a record stored in a collection. A document is represented as a JSON-style object and can contain fields of various data types.</p>
</li>
<li><p><strong>Schema</strong>: A MongoDB schema defines the structure of documents within a collection, including the fields and data types for each field. Unlike a traditional database schema, MongoDB schemas are flexible and can be changed dynamically.</p>
</li>
<li><p><strong>Model</strong>: In the context of Node.js and MongoDB, a model is a JavaScript object that represents a collection in MongoDB. A model provides an interface for querying and modifying documents in a collection.</p>
</li>
<li><p><strong>Index</strong>: A MongoDB index is a data structure that improves the speed of data retrieval operations by allowing queries to quickly locate documents that match certain criteria.</p>
</li>
<li><p><strong>Aggregation</strong>: MongoDB aggregation refers to passing the documents of a collection through a pipeline of processing stages, such as filtering, sorting, and grouping, to compute combined results.</p>
</li>
</ol>
<p>These are some of the most commonly used terminologies in MongoDB, but there are many others as well.</p>
<h3 id="heading-how-to-design-a-schema">How to design a Schema?</h3>
<p>A schema is a blueprint that defines the structure of documents within a collection. Unlike traditional databases, MongoDB allows for flexible schemas, which means that documents within a collection can have different fields and data types.</p>
<p><strong>Fields</strong> in MongoDB refer to the individual pieces of data stored within a document. A field consists of a name and a value. The name of a field is a string that identifies the data stored within it, and the value can be of any data type supported by MongoDB.</p>
<p><img src="https://studio3t.com/wp-content/uploads/2018/10/mongodb-document-structure.png" alt="Getting Started with MongoDB – An Introduction | Studio 3T" /></p>
<p><strong>Types</strong> in MongoDB refer to the various data types that can be used to represent data within a field. Some common data types in MongoDB include:</p>
<ul>
<li><p><strong>String</strong>: The most commonly used type for storing text. Strings in MongoDB must be valid UTF-8.</p>
</li>
<li><p><strong>Integer</strong>: Stores whole numbers. BSON has separate 32-bit and 64-bit integer types.</p>
</li>
<li><p><strong>Boolean</strong>: Stores a boolean (true/false) value.</p>
</li>
<li><p><strong>Double</strong>: Stores floating point values.</p>
</li>
<li><p><strong>Min/Max keys</strong>: Special values that compare lower than and higher than all other BSON values, respectively.</p>
</li>
<li><p><strong>Arrays</strong>: Stores a list of multiple values under one key.</p>
</li>
<li><p><strong>Timestamp</strong>: A special type used mostly internally by MongoDB. For recording when a document was created or modified, the Date type is usually the better choice.</p>
</li>
<li><p><strong>Object</strong>: Used for embedded documents.</p>
</li>
<li><p><strong>Null</strong>: Stores a null value.</p>
</li>
<li><p><strong>Symbol</strong>: Behaves like a string and was intended for languages that have a specific symbol type. It is deprecated.</p>
</li>
<li><p><strong>Date</strong>: Stores a point in time as the number of milliseconds since the Unix epoch. You can store the current time or construct a specific date by creating a Date object and passing it a year, month, and day.</p>
</li>
<li><p><strong>Object ID</strong>: Used to store the document’s ID.</p>
</li>
<li><p><strong>Binary data</strong>: Used to store binary data.</p>
</li>
<li><p><strong>Code</strong>: Used to store JavaScript code in the document.</p>
</li>
<li><p><strong>Regular expression</strong>: Used to store regular expressions.</p>
</li>
</ul>
<p>If you use an ODM such as Mongoose on top of MongoDB, you can also attach "pre" and "post" functions (middleware, also called hooks) that add custom functionality to document and query operations. "Pre" functions are executed before a specific operation, such as a save or an update, and can be used to perform validations or transformations on the data being modified. "Post" functions are executed after a specific operation, such as a save or a find, and can be used to perform additional processing on the result data. Note that these hooks belong to the ODM layer, not to the MongoDB server itself.</p>
<p>These hooks are ordinary JavaScript functions registered on a schema. They can be used for a wide variety of tasks, such as data validation, transformation, and logging.</p>
<h3 id="heading-querying-and-manipulation">Querying and Manipulation</h3>
<p>MongoDB offers powerful querying and aggregation capabilities, as well as support for indexing, to enable efficient and flexible data retrieval and manipulation.</p>
<p><strong>Querying</strong>: MongoDB supports a wide range of query operators and methods for retrieving documents from collections based on specific criteria. Queries can be based on a single field, a combination of fields, or even nested fields within a document. MongoDB also supports a flexible query language that allows for complex logical expressions and regular expressions.</p>
<p><strong>Aggregation</strong>: MongoDB's aggregation pipeline provides a flexible and powerful way to perform complex data transformations and analysis on collections. The pipeline consists of stages that can be used to filter, transform, group, and aggregate data in a variety of ways. Each stage in the pipeline takes input from the previous stage and passes the output to the next stage. This <a target="_blank" href="https://www.youtube.com/watch?v=A3jvoE0jGdE&amp;list=PLWkguCWKqN9OwcbdYm4nUIXnA2IoXX0LI">playlist</a> by Bogdan Stashchuk has everything you need to get started working with MongoDB aggregation.</p>
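<p>To make the idea of stages feeding into one another concrete, here is a minimal in-memory sketch in plain Python. It does not use MongoDB at all: the <code>orders</code> data and the helper names <code>match</code>, <code>group_sum</code>, <code>sort_desc</code>, and <code>run_pipeline</code> are made up for illustration, and they only mimic the spirit of the filter, group, and sort stages described above.</p>

```python
from collections import defaultdict

# Hypothetical sample documents, invented for this illustration.
orders = [
    {"customer": "asha", "status": "paid", "amount": 40},
    {"customer": "ben", "status": "paid", "amount": 15},
    {"customer": "asha", "status": "refunded", "amount": 40},
    {"customer": "ben", "status": "paid", "amount": 30},
    {"customer": "asha", "status": "paid", "amount": 25},
]


def match(docs, field, value):
    """Filter stage: keep documents whose field equals value."""
    return [d for d in docs if d.get(field) == value]


def group_sum(docs, key, field):
    """Group stage: add up one field per distinct value of the key."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return [{"_id": k, "total": v} for k, v in totals.items()]


def sort_desc(docs, field):
    """Sort stage: order documents by a field, largest first."""
    return sorted(docs, key=lambda d: d[field], reverse=True)


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


result = run_pipeline(orders, [
    lambda docs: match(docs, "status", "paid"),
    lambda docs: group_sum(docs, "customer", "amount"),
    lambda docs: sort_desc(docs, "total"),
])
print(result)
```

<p>Each stage only sees what the previous stage produced, which is why filtering early tends to keep the later stages cheap.</p>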
<p><strong>Indexing</strong>: MongoDB supports a wide range of indexing options to improve query performance and data retrieval times. Indexes can be created on single fields or combinations of fields within a collection. MongoDB supports several types of indexes, including unique indexes, text indexes, and geospatial indexes. Indexes can significantly improve query performance, especially for large collections, and can also support efficient sorting and range queries.</p>
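<p>The intuition behind an index can be sketched in a few lines of Python. This is only an illustration of the lookup idea: MongoDB's indexes are B-tree based (which is what lets them serve range queries and sorting), while the dictionary below only handles exact matches. The <code>users</code> data and the function names are invented for the example.</p>

```python
from collections import defaultdict

# Hypothetical sample documents, invented for this illustration.
users = [
    {"_id": 1, "city": "Kolkata"},
    {"_id": 2, "city": "Pune"},
    {"_id": 3, "city": "Kolkata"},
]


def build_index(docs, field):
    """Map each field value to the positions of the documents that have it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc[field]].append(position)
    return index


def find_with_scan(docs, field, value):
    """No index: every document has to be inspected."""
    return [d for d in docs if d[field] == value]


def find_with_index(docs, index, value):
    """With an index: jump straight to the matching positions."""
    return [docs[i] for i in index.get(value, [])]


city_index = build_index(users, "city")
print(find_with_index(users, city_index, "Kolkata"))
```

<p>The trade-off is the usual one: the index has to be stored and kept up to date on every write, in exchange for faster reads.</p>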
<h3 id="heading-scaling-techniques">Scaling Techniques</h3>
<p>Scaling MongoDB to handle larger datasets usually relies on one or more of the following techniques:</p>
<ol>
<li><p><strong>Sharding</strong>: Sharding is a technique for horizontally scaling MongoDB across multiple servers. With sharding, data is distributed across multiple shards, which are groups of servers that each contain a subset of the data. Sharding can help to improve performance and handle larger datasets by allowing MongoDB to distribute the workload across multiple servers.</p>
</li>
<li><p><strong>Replication</strong>: Replication is a technique for ensuring high availability and fault tolerance in MongoDB. With replication, multiple copies of the data are stored across multiple servers that together form a replica set. Writes go to the primary member and are automatically replicated to the secondary members. This keeps the data available even in the event of a server failure, and reads can optionally be served from secondaries.</p>
</li>
<li><p><strong>Indexing</strong>: Indexing is a technique for improving query performance by creating indexes on fields within a collection. Indexes allow MongoDB to retrieve and sort data efficiently, which can help to improve performance and reduce query times.</p>
</li>
<li><p><strong>Compression</strong>: MongoDB supports several compression techniques that can be used to reduce the size of the data stored in the database. Compression can help to reduce storage requirements and improve performance, especially for larger datasets.</p>
</li>
<li><p><strong>Caching</strong>: Caching is a technique for improving performance by storing frequently accessed data in memory. MongoDB supports several caching mechanisms, including the WiredTiger cache, which can help to improve performance and reduce query times.</p>
</li>
<li><p><strong>Load balancing</strong>: Load balancing is a technique for distributing incoming traffic across multiple servers to improve performance and reduce the risk of overloading any individual server. Load balancing can help to improve performance and handle larger datasets by distributing the workload across multiple servers.</p>
</li>
</ol>
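<p>As a rough sketch of the sharding idea from the list above, the snippet below routes each document to a shard by hashing its shard key. The shard names and the <code>shard_for</code> function are hypothetical, and the modulo step is a simplification: MongoDB manages placement through ranges of shard key values rather than a plain modulo.</p>

```python
import hashlib

# Hypothetical shard names, invented for this illustration.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(shard_key_value, shards=SHARDS):
    """Pick a shard deterministically from a hash of the shard key."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Distribute nine hypothetical user ids across the shards.
placement = {}
for user_id in range(1, 10):
    placement.setdefault(shard_for(user_id), []).append(user_id)

print(placement)
```

<p>Because the mapping is deterministic, a query that includes the shard key can be sent to a single shard instead of being broadcast to all of them.</p>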
<hr />
<p>I hope this was an informative blog on MongoDB. There will be a second part that will build a much more practical understanding of MongoDB and its capabilities. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Introduction to Machine Learning]]></title><description><![CDATA[Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves teaching machines to learn from data without being explicitly programmed. The machine learning algorithms automatically learn patterns and insights from the data and use ...]]></description><link>https://highonbugs.sbk2k1.in/introduction-to-machine-learning</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/introduction-to-machine-learning</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[AI]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[Deep Learning]]></category><category><![CDATA[ML]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Thu, 16 Mar 2023 13:38:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678971321130/a7fd00df-597f-4ba6-a534-6ae681a8af58.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves teaching machines to learn from data without being explicitly programmed. The machine learning algorithms automatically learn patterns and insights from the data and use that learning to make predictions or decisions on new and unseen data. It is a data-driven approach that enables computers to improve their performance on a specific task as they gain more experience, without human intervention.</p>
<p>Machine Learning is needed because it enables computers to learn and improve from experience and data, which can lead to more accurate and efficient predictions, decision-making, and automation of tasks. For example, machine learning is used in image recognition, speech recognition, natural language processing, fraud detection, recommendation systems, and many other applications that rely on pattern recognition and prediction.</p>
<h3 id="heading-how-is-it-different-from-ai-deep-learning-and-data-science">How is it different from AI, Deep Learning, and Data Science?</h3>
<p><img src="https://www.corpnce.com/wp-content/uploads/2019/08/DS_MLRelationship.jpg" alt="How are AI, Machine Learning, Deep Learning &amp; Data Science Related?" /></p>
<p>Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves the use of algorithms and statistical models to enable machines to learn from data without being explicitly programmed. ML algorithms can be classified into three main types: supervised learning, unsupervised learning, and reinforcement learning.</p>
<p>Artificial Intelligence (AI) is a broader field that aims to create machines that can perform tasks requiring human-like intelligence, including problem-solving, decision-making, and natural language understanding. AI is composed of multiple subfields, including ML, natural language processing, robotics, computer vision, and more.</p>
<p>Deep Learning (DL) is a subset of ML that uses artificial neural networks with multiple layers to learn and extract high-level representations of data. DL has been used to achieve state-of-the-art results in tasks such as image recognition, natural language processing, and speech recognition.</p>
<p>Data Science (DS) is an interdisciplinary field that uses statistical and computational methods to collect, analyze, and interpret data. It combines skills and techniques from statistics, mathematics, computer science, and domain-specific knowledge to derive insights and support decisions.</p>
<p>In summary, AI is the broadest field; ML is a subfield of AI, and DL is in turn a subfield of ML. Data Science is a separate but overlapping field that uses statistical and computational techniques to extract insights and knowledge from data. While they are all related, each field has its own focus and set of tools and techniques.</p>
<h3 id="heading-types-of-machine-learning">Types of Machine Learning</h3>
<p>There are three main types of Machine Learning: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.</p>
<ol>
<li><strong>Supervised Learning</strong>: This type of machine learning involves training the model using a labeled dataset, which means the input data is already labeled with the desired output. The model learns to predict the output for new and unseen data based on the patterns and relationships it has learned from the labeled data. Some examples of supervised learning include:</li>
</ol>
<ul>
<li><p>Image Classification: Identifying whether an image contains a cat or a dog</p>
</li>
<li><p>Spam Detection: Classifying emails as spam or non-spam</p>
</li>
<li><p>Sentiment Analysis: Predicting whether a review is positive or negative</p>
</li>
</ul>
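<p>A tiny supervised learning example, sketched in plain Python: a nearest-neighbour classifier that labels a new point with the label of the closest training example. The feature values and the "cat"/"dog" labels are made up for illustration, and nearest-neighbour is just one simple choice of algorithm.</p>

```python
import math

# Hypothetical labelled dataset: (features, label) pairs.
training_data = [
    ((1.0, 1.0), "cat"),
    ((1.5, 2.0), "cat"),
    ((5.0, 5.5), "dog"),
    ((6.0, 5.0), "dog"),
]


def predict(features, labelled_examples):
    """Return the label of the training example closest to the input."""
    nearest = min(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


print(predict((1.2, 1.4), training_data))
print(predict((5.5, 5.2), training_data))
```

<p>The labels in <code>training_data</code> are what make this supervised: the model never has to guess what the categories are, only which one a new point belongs to.</p>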
<ol start="2">
<li><strong>Unsupervised Learning</strong>: This type of machine learning involves training the model on an unlabeled dataset, which means the input data is not labeled with the desired output. The model learns to identify patterns and relationships in the data without any guidance, for example by grouping similar data points together. Some examples of unsupervised learning include:</li>
</ol>
<ul>
<li><p>Clustering: Grouping customers based on their purchase history or behavior</p>
</li>
<li><p>Anomaly Detection: Identifying unusual patterns or outliers in a dataset</p>
</li>
<li><p>Dimensionality Reduction: Reducing the number of features in a dataset while retaining the most important information</p>
</li>
</ul>
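<p>Clustering can be sketched with a small one-dimensional k-means loop. The <code>purchases</code> values are invented, and starting the centroids at the first and last value is an arbitrary choice made to keep the example deterministic.</p>

```python
# Hypothetical purchase totals with no labels attached.
purchases = [12, 15, 14, 90, 95, 100]


def k_means_1d(values, centroids, rounds=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(rounds):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


centroids, clusters = k_means_1d(purchases, centroids=[purchases[0], purchases[-1]])
print(clusters)
```

<p>Nothing told the algorithm which customers are "low spenders" or "high spenders"; the two groups emerge from the data alone.</p>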
<ol start="3">
<li><strong>Reinforcement Learning</strong>: This type of machine learning involves training the model to make decisions based on trial and error. The model learns by interacting with its environment and receiving feedback in the form of rewards or penalties based on its actions. The goal of the model is to maximize the rewards it receives by learning from its mistakes. Some examples of reinforcement learning include:</li>
</ol>
<ul>
<li><p>Game Playing: Learning to play chess, Go, or other games by playing against itself or other opponents</p>
</li>
<li><p>Robotics: Learning to perform tasks such as walking, grasping objects, or navigating a maze</p>
</li>
<li><p>Recommendation Systems: Learning to recommend products or content based on user feedback and preferences.</p>
</li>
</ul>
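<p>The trial-and-error loop of reinforcement learning can be sketched with an epsilon-greedy agent choosing between three actions. The fixed payouts in <code>ARM_REWARDS</code> are a made-up environment, and epsilon-greedy is only one simple strategy among many.</p>

```python
import random

# Hypothetical environment: each action always pays the same reward.
ARM_REWARDS = [0.2, 0.5, 0.8]


def train(steps=1000, epsilon=0.1, seed=0):
    """Learn a reward estimate per action by trying actions and observing rewards."""
    rng = random.Random(seed)
    estimates = [0.0] * len(ARM_REWARDS)
    counts = [0] * len(ARM_REWARDS)
    for _ in range(steps):
        if epsilon > rng.random():
            # Explore: try a random action.
            arm = rng.randrange(len(ARM_REWARDS))
        else:
            # Exploit: pick the action with the best estimate so far.
            arm = max(range(len(estimates)), key=lambda i: estimates[i])
        reward = ARM_REWARDS[arm]
        counts[arm] += 1
        # Incremental average of the rewards seen for this action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates


estimates = train()
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(best_arm, estimates)
```

<p>Without the occasional random exploration the agent would settle on the first action that paid anything at all, which is the classic exploration versus exploitation trade-off.</p>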
<h3 id="heading-examples">Examples</h3>
<p>There are many real-life applications of Machine Learning (ML) across various industries. Here are some examples:</p>
<ol>
<li><p><strong>Image and speech recognition</strong>: Image and speech recognition technology uses ML algorithms to enable machines to recognize and understand images and spoken language. Some common applications of this technology include virtual assistants like Siri and Alexa, facial recognition technology used in security systems, and self-driving cars.</p>
</li>
<li><p><strong>Fraud detection</strong>: Financial institutions and credit card companies use ML algorithms to detect fraudulent transactions in real time. This technology enables companies to quickly identify and stop fraudulent transactions, protecting both the company and the customer.</p>
</li>
<li><p><strong>Healthcare</strong>: ML is used in healthcare to improve patient outcomes and reduce costs. It can be used to identify patterns in patient data to enable early detection and treatment of diseases, as well as to develop personalized treatment plans.</p>
</li>
<li><p><strong>Manufacturing</strong>: Manufacturing companies use ML algorithms to optimize their production processes, reduce waste, and improve product quality. For example, Tesla uses ML algorithms to improve the efficiency of their battery manufacturing process and to develop their autonomous driving technology.</p>
</li>
<li><p><strong>Customer service</strong>: Many companies use ML algorithms to improve their customer service operations. This can include chatbots that can answer customer questions and resolve issues, as well as personalized marketing campaigns that are tailored to each individual customer's preferences and behavior.</p>
</li>
</ol>
<hr />
<p>This was just a short intro for the ML series that I will start (eventually). There isn't much to learn from this blog, but it can serve as an intro for someone just starting out. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Web Basics - Part 4]]></title><description><![CDATA[What are Cookies, Local Storage, and Session Storage?
Cookies, local storage, and session storage are all ways to store data in a user's browser while they are interacting with a website.
Cookies are small text files that are stored on a user's compu...]]></description><link>https://highonbugs.sbk2k1.in/web-basics-part-4</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/web-basics-part-4</guid><category><![CDATA[cache]]></category><category><![CDATA[cookies]]></category><category><![CDATA[Session]]></category><category><![CDATA[localstorage]]></category><category><![CDATA[data]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Wed, 15 Mar 2023 10:29:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678873542544/c4c3e2d5-5220-4f2a-99d1-229d1666facd.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-what-are-cookies-local-storage-and-session-storage">What are Cookies, Local Storage, and Session Storage?</h3>
<p>Cookies, local storage, and session storage are all ways to store data in a user's browser while they are interacting with a website.</p>
<p>Cookies are small text files that are stored on a user's computer by a website they visit. They can be used to remember information about the user, such as their login details or their preferences for using the website. Cookies are important for web development because they allow websites to provide personalized experiences for users and to track user behavior for analytics purposes.</p>
<p>Local storage is a way to store larger amounts of data in a user's browser than is possible with cookies. Local storage is designed to be used for data that needs to persist beyond a single session, such as user preferences or settings. Local storage is important for web development because it provides a way to store data on the client side of a website, reducing the need to constantly request data from the server.</p>
<p>Session storage is similar to local storage, but the data it stores is only available for the duration of a page session in a single browser tab. The data survives page reloads and navigation within the same tab, and it is deleted once the user closes that tab or the browser. Session storage is important for web development because it provides a way to store data temporarily during a user's session, such as items in a shopping cart or form data.</p>
<h3 id="heading-cookies">Cookies</h3>
<p>Cookies are small text files that are stored on a user's computer or device when they browse a website. These files contain information about the user's activities on the website, including preferences, settings, login information, and browsing history. Cookies are designed to enhance the user's experience by making it easier and faster for them to access the website's features and content.</p>
<p>There are several types of cookies:</p>
<ol>
<li><p><strong>Session cookies</strong>: These cookies are temporary and are deleted when the user closes their browser. They are used to remember the user's preferences and settings during a single session.</p>
</li>
<li><p><strong>Persistent cookies</strong>: These cookies are stored on the user's computer even after they close their browser. They are used to remember the user's preferences and settings for future sessions.</p>
</li>
<li><p><strong>First-party cookies</strong>: These cookies are set by the website that the user is visiting.</p>
</li>
<li><p><strong>Third-party cookies</strong>: These cookies are set by a third-party website, such as an advertising network or analytics service.</p>
</li>
</ol>
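<p>The difference between session and persistent cookies comes down to whether an expiry is set. Here is a small sketch using Python's standard <code>http.cookies</code> module to build the cookie strings a server might send; the cookie names, values, and the <code>is_session_cookie</code> helper are invented for illustration.</p>

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no Max-Age or Expires, so the browser drops it on close.
cookies["session_id"] = "abc123"
cookies["session_id"]["httponly"] = True

# Persistent cookie: kept for 30 days thanks to Max-Age.
cookies["theme"] = "dark"
cookies["theme"]["max-age"] = 60 * 60 * 24 * 30
cookies["theme"]["path"] = "/"


def is_session_cookie(morsel):
    """A cookie without Max-Age and Expires lives only for the browser session."""
    return not morsel["max-age"] and not morsel["expires"]


for morsel in cookies.values():
    kind = "session" if is_session_cookie(morsel) else "persistent"
    print(kind, "->", morsel.OutputString())
```

<p>Whether a cookie is first-party or third-party is not visible in the cookie string itself; it depends on which site set the cookie relative to the page the user is viewing.</p>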
<p>Some examples of cookies include:</p>
<ol>
<li><p><strong>Authentication cookies</strong>: These cookies are used to keep a user logged in, typically by storing a session identifier or token rather than the username and password themselves.</p>
</li>
<li><p><strong>Shopping cart cookies</strong>: These cookies are used to remember the items that a user has added to their shopping cart on an e-commerce website.</p>
</li>
<li><p><strong>Analytics cookies</strong>: These cookies are used to track a user's behavior on a website, such as which pages they visit and how long they stay on each page.</p>
</li>
<li><p><strong>Advertising cookies</strong>: These cookies are used to display targeted ads to users based on their browsing history and interests.</p>
</li>
<li><p><strong>Social media cookies</strong>: These cookies are used to integrate social media features into a website, such as the ability to share content on social media platforms.</p>
</li>
</ol>
<h3 id="heading-local-storage">Local Storage</h3>
<p>Local storage is a web technology that allows web applications to store data on the client-side (user's browser) beyond the lifetime of a single session. It is a way for web developers to save information that persists even after the browser is closed, allowing the user to pick up where they left off the next time they visit the website.</p>
<p>Local storage works by providing a key-value storage mechanism for data. The data is stored in the user's browser in a storage area separate from cookies, with a much larger capacity (typically around 5 MB per origin, compared to roughly 4 KB per cookie, although exact limits vary by browser). Both keys and values are stored as strings, so other data types need to be converted, for example with <code>JSON.stringify</code> and <code>JSON.parse</code>.</p>
<p>The data stored in local storage is specific to the origin of the website (the combination of protocol, domain, and port), meaning that it cannot be accessed by other websites. It is also accessible to all scripts running on that origin, making it a useful tool for sharing data between different parts of a web application.</p>
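<p>The "everything is a string" behaviour is easy to trip over, so here is a small Python stand-in that mimics it. <code>StringStore</code> is a hypothetical class written for this example, not a browser API; in the browser the equivalent calls would be <code>localStorage.setItem</code> and <code>localStorage.getItem</code>.</p>

```python
import json


class StringStore:
    """Stand-in for a browser storage area: keys and values are always strings."""

    def __init__(self):
        self._items = {}

    def set_item(self, key, value):
        # Like the browser, coerce whatever is passed in to a string.
        self._items[str(key)] = str(value)

    def get_item(self, key):
        # Missing keys yield None, similar to null in the browser.
        return self._items.get(str(key))


store = StringStore()

# A number goes in, but a string comes back out.
store.set_item("visits", 3)
raw = store.get_item("visits")

# Structured data has to be serialized explicitly, here with JSON.
settings = {"theme": "dark", "font_size": 16}
store.set_item("settings", json.dumps(settings))
restored = json.loads(store.get_item("settings"))

print(repr(raw), restored)
```

<p>Serializing with JSON on the way in and parsing on the way out is the usual way to keep objects and arrays intact.</p>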
<p>Local storage is needed for several reasons:</p>
<ol>
<li><p><strong>Persistent data storage</strong>: Local storage allows web applications to store data that persists beyond the lifetime of a single session. This means that users can come back to the website at a later time and pick up where they left off, without losing any data.</p>
</li>
<li><p><strong>Improved performance</strong>: Local storage can improve the performance of web applications by reducing the need for server requests. By storing data locally, the website can access it more quickly and efficiently, without having to make requests to the server.</p>
</li>
<li><p><strong>Offline functionality</strong>: Local storage can be used to store data that is needed for offline functionality. For example, a web application that needs to be accessed in areas with poor internet connectivity can use local storage to store data that can be accessed even when the user is offline.</p>
</li>
<li><p><strong>Enhanced user experience</strong>: Local storage can be used to store user preferences, settings, and other data that can improve the user experience. By storing this data locally, the website can personalize the user's experience and provide a more seamless experience overall.</p>
</li>
</ol>
<p>Here are some examples of how local storage can be used:</p>
<ol>
<li><p><strong>Remembering user preferences</strong>: A website can use local storage to remember a user's preferences, such as their preferred language, font size, or theme. This allows the website to provide a more personalized experience for the user.</p>
</li>
<li><p><strong>Storing form data</strong>: When a user is filling out a form on a website, local storage can be used to save their progress. This way, if the user accidentally closes the browser or navigates away from the page, they can come back and continue where they left off.</p>
</li>
<li><p><strong>Saving game progress</strong>: Online games can use local storage to save a player's progress. This allows the player to come back later and resume playing from where they left off, without losing any progress.</p>
</li>
<li><p><strong>Cache management</strong>: Local storage can be used to keep small pieces of frequently used data, such as API responses serialized as JSON, to reduce the number of server requests. This can improve the performance of the website and reduce page load times. Larger assets such as images are better left to the browser cache.</p>
</li>
<li><p><strong>Offline functionality</strong>: Web applications can use local storage to store data that is needed for offline functionality. For example, a note-taking application can use local storage to store the user's notes, which can be accessed even when the user is offline.</p>
</li>
</ol>
<h3 id="heading-session-storage">Session Storage</h3>
<p>Session storage is a web storage technology that allows web applications to store data on the client-side (user's browser) for the duration of a single page session. A page session lasts for as long as the browser tab is open: it survives page reloads and navigation within that tab, and each tab gets its own separate session storage. Session storage provides a way for web developers to store data temporarily, for use during the current session.</p>
<p>Session storage works by providing a key-value storage mechanism for data, similar to local storage. The data is stored in the user's browser and is specific to the origin of the website. However, unlike local storage, the data stored in session storage is deleted when the user closes the tab or the browser.</p>
<p>Here are some examples of how session storage can be used:</p>
<ol>
<li><p><strong>Anonymous shopping cart</strong>: Session storage can be used to store the items in a user's shopping cart during a single session without logging in. This allows the user to add and remove items from their cart without losing any data.</p>
</li>
<li><p><strong>Page state</strong>: Session storage can be used to store the state of a webpage during a session. For example, if a user is on a webpage that allows them to filter results, session storage can be used to store their filter preferences so that they can be applied to subsequent searches.</p>
</li>
</ol>
<h3 id="heading-other-data-storages">Other Data Storages</h3>
<p>Browser cache, IndexedDB, and Web SQL are three web technologies that can be used to store data on the client-side (user's browser) to improve website performance and provide offline functionality.</p>
<ol>
<li><p><strong>Browser cache</strong>: A browser cache is a mechanism used by web browsers to temporarily store web page data, such as HTML, CSS, and JavaScript files, images, and other assets. When a user visits a web page, the browser checks if it has cached versions of the requested resources. If cached versions exist, the browser loads them from the cache instead of making a new request to the server, which can significantly reduce page load times and improve website performance. The usefulness of browser cache is that it reduces the number of requests to the server, which in turn reduces server load and bandwidth usage. It also improves the user experience by allowing pages to load faster.</p>
</li>
<li><p><strong>IndexedDB</strong>: IndexedDB is a client-side database technology that allows web developers to store and retrieve large amounts of structured data in the user's browser. It is designed to be a scalable storage solution that can store large amounts of data and work efficiently with large datasets. IndexedDB works by storing data in key-value pairs, with the ability to create indexes for efficient data retrieval. Data stored in IndexedDB can be queried using various APIs, allowing web applications to manipulate and retrieve data as needed. The usefulness of IndexedDB is that it provides a way for web developers to store large amounts of data on the client side, which can improve website performance by reducing the number of server requests. It also provides offline functionality, allowing web applications to continue working even when the user is not connected to the internet.</p>
</li>
<li><p><strong>Web SQL</strong>: Web SQL is a deprecated client-side database technology that allowed web developers to store structured data in the user's browser. It used an SQLite database engine to provide a SQL-like interface for storing and retrieving data. The usefulness of Web SQL was that it provided a way for web developers to store data on the client side, allowing for improved website performance and offline functionality. However, it has been deprecated in favor of IndexedDB due to concerns about cross-browser compatibility and security.</p>
</li>
</ol>
<hr />
<p>These were some of the web storage mechanisms that help us retain and store data on the client side. They support a lot of business logic as well as user experience improvements. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Web Basics - Part 3]]></title><description><![CDATA[Basic Architecture of the Web
The basic architecture of the web is a distributed client-server architecture, where the client sends requests to a server and the server sends responses back to the client. The client is typically a web browser or other...]]></description><link>https://highonbugs.sbk2k1.in/web-basics-part-3</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/web-basics-part-3</guid><category><![CDATA[client]]></category><category><![CDATA[server]]></category><category><![CDATA[ip address]]></category><category><![CDATA[dns]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Tue, 14 Mar 2023 13:58:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678802233524/8f329586-e8c9-4581-91e9-8df07ab9c31b.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-basic-architecture-of-the-web">Basic Architecture of the Web</h3>
<p>The basic architecture of the web is a distributed client-server architecture, where the client sends requests to a server and the server sends responses back to the client. The client is typically a web browser or other application that requests information or services from a server. The server is a computer program or machine that provides the requested information or services to the client.</p>
<p>The client-server architecture is built on top of the TCP/IP protocol stack, which consists of a set of protocols that define how computers communicate over a network. At the application layer, the most commonly used protocols in the client-server architecture are HTTP (HyperText Transfer Protocol), FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and DNS (Domain Name System).</p>
<p><img src="https://miro.medium.com/v2/resize:fit:875/1*DUPqrw8b9G01NPpZox9hng.jpeg" alt /></p>
<p>HTTP is the protocol that defines how web pages are transferred between clients and servers. It is a request-response protocol, where the client sends a request to the server and the server sends a response back to the client. FTP is the protocol that defines how files are transferred between clients and servers. SMTP is the protocol that defines how email messages are transferred between clients and servers.</p>
<h3 id="heading-role-and-anatomy-of-ip-addresses">Role and Anatomy of IP Addresses</h3>
<p>An IP address is a unique numerical identifier that is assigned to devices connected to the internet. It stands for Internet Protocol address and is used to route data packets from one device to another.</p>
<p>There are two types of IP addresses: IPv4 and IPv6. An IPv4 address is 32 bits long, which allows for 2<sup>32</sup>, or roughly 4.3 billion, unique addresses. An IPv6 address is 128 bits long, which allows for 2<sup>128</sup> (about 3.4 × 10<sup>38</sup>) unique addresses. IPv6 was developed to address the shortage of IPv4 addresses.</p>
<p>IP addresses are used to identify devices on the internet. When a device connects to the internet, it is assigned an IP address. This IP address is used to route data packets from one device to another. When you enter a website's domain name into your browser, the browser uses DNS to translate that domain name into an IP address. The IP address is then used to establish a connection between your device and the website's server, allowing data to be transmitted back and forth.</p>
<p>An IP address consists of two parts: the network address and the host address. The network address is used to identify the network that the device belongs to, while the host address identifies the specific device on that network. The division between the network address and host address is determined by the subnet mask.</p>
<p>In IPv4, the first few bits of the address represent the network address, while the remaining bits represent the host address. The subnet mask is used to determine the number of bits used for the network address and the host address. For example, a subnet mask of 255.255.255.0 indicates that the first three octets of the IP address represent the network address, and the last octet represents the host address.</p>
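<p>As a small sketch of how the mask splits an address, the snippet below applies a bitwise AND using Python's standard <code>ipaddress</code> module. The function name <code>split_address</code> and the sample address 192.168.0.37 are assumptions chosen for the example.</p>

```python
import ipaddress


def split_address(address: str, mask: str) -> tuple[str, int]:
    """Return the network address and the host number for an IPv4 address."""
    addr_int = int(ipaddress.IPv4Address(address))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Bits set in the mask select the network part.
    network_int = addr_int & mask_int
    # Bits cleared in the mask select the host part.
    host_int = addr_int & ~mask_int & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(network_int)), host_int


network, host = split_address("192.168.0.37", "255.255.255.0")
print(network, host)
```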
<p>In IPv6, the first 64 bits of the address conventionally represent the network prefix, while the remaining 64 bits form the interface identifier (the host part). The prefix length is written in CIDR notation, and although <code>/64</code> is the standard size for an individual subnet, providers are commonly allocated shorter prefixes that they divide further.</p>
<p>In summary, IP addresses are a crucial component of the internet as they allow devices to communicate with each other. They are used to identify devices on the internet and route data packets from one device to another.</p>
<h3 id="heading-how-does-dns-work">How does DNS work?</h3>
<p>DNS (Domain Name System) is a hierarchical system that translates domain names into IP (Internet Protocol) addresses and vice versa.</p>
<p>When a user types a domain name in their web browser, the computer sends a request to a DNS resolver to resolve the domain name into an IP address. The resolver first checks its cache to see if it has a record of the IP address for the domain name. If the resolver does not have a record, it sends a query to a root DNS server, which responds with a referral to the appropriate Top-Level Domain (TLD) DNS server for that domain.</p>
<p>The TLD DNS server then responds with a referral to the authoritative DNS server for the domain, which has the IP address information for the domain. The resolver then caches the IP address information and returns it to the user's computer, allowing the web browser to connect to the IP address and load the website associated with the domain name.</p>
<p>Conversely, a reverse DNS lookup translates an IP address back into a domain name. This is not what happens when a user enters an IP address in their web browser (the browser simply connects to that address); reverse lookups are mostly used by mail servers, logging, and diagnostic tools. The process is similar to the forward DNS lookup, but it queries PTR records under a special reverse-lookup domain to find the domain name associated with the IP address.</p>
<p>Overall, the DNS system is crucial for navigating the internet and accessing websites through domain names rather than memorizing IP addresses.</p>
<p>Let's go through an example. Thank you <a class="user-mention" href="https://hashnode.com/@codedamncom">Codedamn</a> for explaining this to me. Check out <a target="_blank" href="https://codedamn.com/">Codedamn</a> for great content.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678798326069/8c760f5d-3cdb-41bd-a242-04bda18cb2b1.png" alt class="image--center mx-auto" /></p>
<p>The steps followed are:</p>
<ol>
<li><p>Either our ISP provides a DNS resolver or we configure one ourselves. It is a server at a known IP address that maintains a kind of lookup table mapping domains to their IP addresses.</p>
</li>
<li><p>On getting a request, our OS asks the DNS resolver for the IP address of the website we are looking for, in this case "codedamn.com".</p>
</li>
<li><p>A request chain ensues in which 1.1.1.1 (our DNS resolver) asks another server, for example 65.24.11.22, which keeps track of all ".com" domains. It replies with the IP of the next server to ask (77.22.11.2).</p>
</li>
<li><p>We do not get the IP of "codedamn.com" from 77.22.11.2, but it points us towards 44.11.232.55.</p>
</li>
<li><p>We finally get the required IP from 44.11.232.55.</p>
</li>
</ol>
<p>These domain-to-IP conversions are cached so that the entire request chain is not executed again.</p>
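<p>The referral chain and the caching described above can be sketched with a toy resolver. Everything here is hypothetical: the server names (<code>root</code>, <code>tld-com</code>, <code>ns-example</code>), the domain, and the answer 203.0.113.10 (an address from a range reserved for documentation) are placeholders, and real DNS involves record types, TTLs, and network queries that this sketch ignores.</p>

```python
# Hypothetical in-memory "servers": each maps a name suffix to either a
# referral (the next server to ask) or a final answer (an IP address).
SERVERS = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"example.com": ("referral", "ns-example")},
    "ns-example": {"example.com": ("answer", "203.0.113.10")},
}


class Resolver:
    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, domain: str) -> str:
        # Serve from the cache when possible to skip the whole chain.
        if domain in self.cache:
            return self.cache[domain]
        server = "root"
        while True:
            self.queries_sent += 1
            kind, value = self._ask(server, domain)
            if kind == "answer":
                self.cache[domain] = value
                return value
            server = value  # follow the referral

    def _ask(self, server: str, domain: str):
        labels = domain.split(".")
        # Try the longest matching suffix first: "example.com", then "com".
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in SERVERS[server]:
                return SERVERS[server][suffix]
        raise LookupError(f"{server} has no record for {domain}")


resolver = Resolver()
first = resolver.resolve("example.com")
queries_after_first = resolver.queries_sent
second = resolver.resolve("example.com")
print(first, queries_after_first, resolver.queries_sent)
```

<p>The second lookup is answered from the cache, so the query counter does not move.</p>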
<h3 id="heading-web-servers">Web Servers</h3>
<p>Web servers are software programs that run on a server computer and handle HTTP requests and responses for web pages and web applications. When a user requests a web page or application from a web server, the server responds with the requested content, which is then displayed in the user's web browser.</p>
<p>Apache and Nginx are two of the most popular web servers in use today. Here's a brief overview of how they work:</p>
<p><strong>Apache</strong>: Apache is an open-source web server that has been around since the mid-1990s. It uses a multi-process architecture where each incoming connection is handled by a separate process or thread. When a user makes an HTTP request, Apache receives the request and passes it to a worker process. The worker process then handles the request, generates the response, and sends it back to the user's web browser.</p>
<p>Apache is highly configurable and supports a wide range of modules that can be used to add functionality to the server. This makes it a popular choice for hosting dynamic websites and applications.</p>
<p><strong>Nginx</strong>: Nginx (pronounced "engine-x") is a lightweight, high-performance web server that was first released in 2004. It uses an event-driven, non-blocking architecture that allows it to handle a large number of simultaneous connections without using a lot of system resources. Nginx runs a small, fixed number of worker processes, and each worker uses an event loop to serve many connections at once instead of dedicating a process or thread to each one. When a user makes an HTTP request, a worker picks it up, generates the response, and sends it back to the user's web browser.</p>
<p>Nginx is often used as a reverse proxy, which means it sits in front of other web servers and distributes incoming requests to those servers. This can help improve the performance and scalability of a web application.</p>
<p>Both Apache and Nginx are powerful web servers with their own strengths and weaknesses. Choosing between them often depends on the specific needs of a website or application, as well as the preferences of the web administrator.</p>
<p><strong>NOTE: You don't need to know about either of these servers to get into the backend development series. But basic knowledge is always a plus.</strong></p>
<h3 id="heading-sub-net-masks">Sub-Net Masks</h3>
<p>Subnet masks are a way of dividing a network into smaller subnetworks, or subnets. A subnet mask is a 32-bit binary number that is used to identify the network portion and the host portion of an IP address.</p>
<p>In IPv4, an IP address consists of 32 bits, divided into four 8-bit octets. Each octet represents a number between 0 and 255, and is separated by a period. For example, the IP address 192.168.0.1 is represented as 11000000.10101000.00000000.00000001 in binary.</p>
<p>A subnet mask is also a 32-bit binary number, where the network portion of the IP address is represented by a string of ones, and the host portion is represented by a string of zeros. For example, a subnet mask of 255.255.255.0 is represented as 11111111.11111111.11111111.00000000 in binary.</p>
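<p>If you want to check the binary forms yourself, a couple of lines are enough. The helper names <code>to_dotted_binary</code> and <code>prefix_length</code> are my own, and the second one assumes the mask is valid (a run of ones followed by zeros).</p>

```python
def to_dotted_binary(dotted: str) -> str:
    """Convert dotted-decimal notation to dotted 8-bit binary groups."""
    return ".".join(f"{int(octet):08b}" for octet in dotted.split("."))


def prefix_length(mask: str) -> int:
    """Count the one-bits in a subnet mask, i.e. the network bits."""
    return to_dotted_binary(mask).count("1")


print(to_dotted_binary("192.168.0.1"))
print(to_dotted_binary("255.255.255.0"))
print(prefix_length("255.255.255.0"))
```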
<p>The subnet mask is used to determine which part of an IP address represents the network and which part represents the host. The network portion of an IP address is used to identify the network that the host belongs to, while the host portion is used to identify the individual host within the network.</p>
<p>By using subnet masks, a network can be divided into smaller subnets, each with its own network and host portion. This allows for more efficient use of IP addresses and can improve network performance and security.</p>
<p>In CIDR (Classless Inter-Domain Routing) notation, the subnet mask is specified by indicating the number of bits used for the network portion of the IP address. For example, <code>/25</code> means that the first 25 bits of the IP address are used to identify the network portion, and the remaining 7 bits are used to identify the host portion.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/04/network-and-host-bits-2.png" alt="Subnet Mask Definition" /></p>
<p>For example, if a network has the IP address range 192.168.0.0/24 and a subnet mask of 255.255.255.0, it can be divided into smaller subnets with their own unique network portion and host portion. A subnet with the IP address range 192.168.0.0/25 would have a subnet mask of 255.255.255.128 and can support up to 126 hosts, while a subnet with the IP address range 192.168.0.128/25 would also have a subnet mask of 255.255.255.128.</p>
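<p>The same split can be reproduced with Python's standard <code>ipaddress</code> module. This is only a sketch to verify the numbers in the example; subtracting 2 accounts for the network and broadcast addresses, which are not assigned to hosts.</p>

```python
import ipaddress

network = ipaddress.ip_network("192.168.0.0/24")

# Borrow one host bit for the network part: one /24 becomes two /25 subnets.
subnets = list(network.subnets(prefixlen_diff=1))

for subnet in subnets:
    usable_hosts = subnet.num_addresses - 2
    print(subnet, subnet.netmask, usable_hosts)
```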
<h3 id="heading-security-considerations">Security Considerations</h3>
<p>Security considerations are crucial when it comes to both servers and clients. There are several key measures that can be taken to ensure the security of these systems, including the use of Firewalls, SSL/TLS, and HTTPS.</p>
<ol>
<li><p>In computing, a <strong>Firewall</strong> is a network security system that monitors and controls the incoming and outgoing network traffic based on predetermined security rules. A firewall typically establishes a barrier between a trusted network and an untrusted network, such as the Internet. At its most basic, a firewall is essentially the barrier that sits between a private internal network and the public Internet. A firewall’s main purpose is to allow non-threatening traffic in and to keep dangerous traffic out.</p>
</li>
<li><p><strong>SSL/TLS</strong> is another important security measure that is commonly used in both servers and clients. SSL (Secure Sockets Layer) and its successor, TLS (Transport Layer Security), are cryptographic protocols that provide secure communication over the internet. SSL/TLS is commonly used to secure web traffic, email, and other types of online communications. SSL/TLS works by encrypting data transmitted between a server and a client, making it difficult for unauthorized users to intercept and read the data.</p>
</li>
<li><p><strong>HTTPS</strong> is a secure version of the HTTP protocol used to transfer data over the internet. It uses SSL/TLS to encrypt the data transmitted between a server and a client, providing an additional layer of security to web traffic. HTTPS is commonly used to protect sensitive data, such as credit card numbers, passwords, and other personal information.</p>
</li>
</ol>
<p>It is essential to keep these measures up to date and regularly review and update security protocols to maintain the highest level of protection.</p>
<h3 id="heading-challenges-in-scalability">Challenges in Scalability</h3>
<p>Scalability is a crucial consideration for web servers that experience high levels of traffic. As web traffic increases, web servers may struggle to keep up with demand, which can lead to slow page load times, downtime, and other issues. There are several challenges that need to be addressed to achieve scalability in web servers, including load balancing, caching and clustering.</p>
<p>Load balancing is a technique that distributes incoming web traffic across multiple servers to avoid overloading any single server. Load balancing can be achieved through various methods, such as round-robin or weighted round-robin, IP hash, or least connections. Load balancing helps to ensure that incoming web traffic is evenly distributed among servers, reducing the load on any single server.</p>
<p>Caching is another technique that can be used to improve the scalability of web servers. Caching involves storing frequently accessed data in memory or on disk, which can significantly reduce the load on the server. Caching can be implemented at various levels, such as database caching, object caching, or page caching.</p>
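<p>As a rough illustration of page caching in memory, the sketch below uses <code>functools.lru_cache</code> from the standard library. The <code>render_page</code> function and its counter are hypothetical stand-ins for an expensive operation such as a database query or template render.</p>

```python
from functools import lru_cache

render_count = 0  # counts how often the expensive work actually runs


@lru_cache(maxsize=128)
def render_page(path: str) -> str:
    """Pretend to do expensive work to build a page."""
    global render_count
    render_count += 1
    return f"Rendered page for {path}"


render_page("/home")   # miss: does the work
render_page("/home")   # hit: served from the cache
render_page("/about")  # miss: a different page

print(render_count, render_page.cache_info())
```

<p>Three requests result in only two renders. A real cache would also need an expiry or invalidation strategy so that stale pages are not served forever.</p>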
<p>Clustering is a technique that involves grouping multiple servers together to act as a single system. Clustering can improve scalability by distributing the load among multiple servers, increasing the capacity of the system. Clustering also provides redundancy, which can help to ensure that the system remains available in the event of a server failure.</p>
<h3 id="heading-load-balancing">Load Balancing</h3>
<p>Load balancing is the process of distributing workloads or traffic evenly across multiple resources, such as servers or network links, in order to optimize resource utilization, increase performance, and improve the availability and reliability of the system.</p>
<p>Load balancing is needed because as a system grows, it can become overwhelmed by requests or traffic, causing slowdowns or failures. By distributing the workload across multiple resources, load balancing ensures that no single resource becomes overwhelmed, thereby improving the overall performance and availability of the system.</p>
<p>Load balancing is helpful in many ways, including:</p>
<ol>
<li><p><strong>Scalability</strong>: Load balancing allows a system to scale out by adding more resources to handle increasing demand, rather than scaling up by adding more capacity to a single resource.</p>
</li>
<li><p><strong>Fault tolerance</strong>: Load balancing ensures that if one resource fails, traffic can be automatically rerouted to another resource to avoid downtime or service interruption.</p>
</li>
<li><p><strong>Performance</strong>: Load balancing can improve the performance of a system by evenly distributing traffic across multiple resources, which can help reduce response times and increase throughput.</p>
</li>
</ol>
<p>There are different load-balancing algorithms that can be used to determine how traffic is distributed among resources. These include:</p>
<ol>
<li><p>Round-robin: Traffic is distributed in a cyclical manner among the available resources.</p>
<p> <img src="https://iq.opengenus.org/content/images/2020/07/roundrobinmodified.png" alt /></p>
</li>
<li><p>Least connections: Traffic is sent to the resource with the fewest active connections.</p>
</li>
<li><p>IP hash: Traffic is distributed based on the source or destination IP address.</p>
</li>
</ol>
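<p>The three algorithms listed above can be sketched in a few lines each. The backend names (<code>app-1</code> and so on) and the class and function names are made up for the example, and a production load balancer would also handle health checks and concurrency, which are left out here.</p>

```python
import hashlib
from itertools import cycle

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical backend names


class RoundRobin:
    """Hand out servers in a fixed, repeating order."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self) -> str:
        return next(self._servers)


class LeastConnections:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


def ip_hash(client_ip: str, servers) -> str:
    """Map a client IP to a server so the same client lands on the same one."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = RoundRobin(SERVERS)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(SERVERS)
lc_picks = [lc.pick() for _ in range(3)]
lc.release("app-2")          # app-2 finishes its request
lc_next = lc.pick()          # so it is now the least loaded

print(rr_picks, lc_picks, lc_next, ip_hash("198.51.100.7", SERVERS))
```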
<p>Load balancing can be implemented at different layers of the network stack, including the application layer, transport layer, and network layer. Application-level load balancing involves distributing traffic based on application-specific criteria, such as HTTP headers or cookies. Transport-level load balancing involves distributing traffic based on transport-layer information, such as TCP or UDP ports and connections. Network-level load balancing involves distributing traffic based on network-layer information, such as IP addresses or routing information.</p>
<p>Overall, load balancing is a key technique for improving the performance, availability, and scalability of modern computer systems, and it plays an important role in ensuring that critical applications and services remain available and responsive to users.</p>
<h3 id="heading-content-delivery-networks-cdn">Content Delivery Networks (CDN)</h3>
<p>A CDN (Content Delivery Network) is a system of distributed servers that deliver web content to users based on their geographic location. The goal of a CDN is to reduce latency, improve page load times, and increase the availability of web content.</p>
<p>When a user requests a piece of content, such as an image or video, the request is routed to the nearest CDN server, which is usually the one with the lowest latency or closest geographic proximity to the user. The CDN server then serves the content to the user, bypassing the need for the request to travel back to the origin server where the content is hosted.</p>
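<p>Here is a toy model of that flow, assuming made-up edge locations and latencies. The names <code>EdgeServer</code>, <code>nearest_edge</code>, and <code>ORIGIN</code> are hypothetical; real CDNs usually pick the edge through DNS or anycast routing rather than a simple comparison like this.</p>

```python
ORIGIN = {"/logo.png": b"fake image bytes"}  # hypothetical origin content


class EdgeServer:
    def __init__(self, name: str, latency_ms: int):
        self.name = name
        self.latency_ms = latency_ms  # assumed latency to the current user
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path: str) -> bytes:
        # Only go back to the origin when the content is not cached yet.
        if path not in self.cache:
            self.cache[path] = ORIGIN[path]
            self.origin_fetches += 1
        return self.cache[path]


def nearest_edge(edges):
    """Pick the edge server with the lowest latency to the user."""
    return min(edges, key=lambda edge: edge.latency_ms)


edges = [
    EdgeServer("frankfurt", 120),
    EdgeServer("mumbai", 18),
    EdgeServer("virginia", 210),
]

edge = nearest_edge(edges)
edge.get("/logo.png")  # first request: fetched from the origin
edge.get("/logo.png")  # second request: served from the edge cache
print(edge.name, edge.origin_fetches)
```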
<p>CDNs can improve performance in several ways:</p>
<ol>
<li><p><strong>Reduced latency</strong>: By serving content from a nearby server, CDNs can reduce the time it takes for content to travel from the origin server to the user's device, reducing latency and improving page load times.</p>
</li>
<li><p><strong>Improved availability</strong>: CDNs can help ensure that content is always available by replicating it across multiple servers. If one server fails or becomes overloaded, the request can be automatically routed to another server.</p>
</li>
<li><p><strong>Reduced network congestion</strong>: By serving content from a nearby server, CDNs can reduce the amount of traffic that needs to travel over long distances, which can help reduce network congestion and improve overall network performance.</p>
</li>
<li><p><strong>Caching</strong>: CDNs can cache frequently accessed content on edge servers, which can reduce the load on origin servers and improve page load times for subsequent requests.</p>
</li>
<li><p><strong>Security</strong>: CDNs can provide security features such as DDoS protection and SSL encryption to help protect against attacks and improve the security of web content.</p>
</li>
</ol>
<p>Overall, CDNs are an important tool for improving the performance and availability of web content, particularly for websites or applications with a global user base. By leveraging a distributed network of servers, CDNs can help reduce latency, improve availability, and enhance overall user experience.</p>
<hr />
<p>That was most of the basics cleared out of the way before we move on to the backend dev blog. We have to touch on a few bits and bobs, which we will take care of in the next post. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Web Basics - Part 2]]></title><description><![CDATA[Introduction to Protocols
Protocols are essential to modern communication, enabling devices to exchange information over networks in a standardized and reliable manner. Put simply, a protocol is a set of rules that governs how data is transmitted and...]]></description><link>https://highonbugs.sbk2k1.in/web-basics-part-2</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/web-basics-part-2</guid><category><![CDATA[protocols]]></category><category><![CDATA[TCP]]></category><category><![CDATA[ip address]]></category><category><![CDATA[http]]></category><category><![CDATA[smtp]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 12 Mar 2023 21:12:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678653909464/2f9c2799-5bef-472a-aa0c-fa0670ba9e2c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction-to-protocols">Introduction to Protocols</h3>
<p>Protocols are essential to modern communication, enabling devices to exchange information over networks in a standardized and reliable manner. Put simply, a protocol is a set of rules that governs how data is transmitted and received between different devices on a network. These rules ensure that data is transmitted correctly and that devices are able to understand and interpret the data they receive.</p>
<p>The need for protocols arose as computer networks became more widespread, with different types of devices and software needing to communicate with each other over a common network. After the invention of the ARPANET, multiple networks were created, leading to compatibility issues and communication problems between different devices. This problem was addressed by the development of standardized protocols that could be used across different networks and devices.</p>
<p>Today, there are numerous networking protocols in use, ranging from the well-known HTTP protocol used for transmitting data over the web, to the complex TCP/IP protocol suite that forms the backbone of the internet. Understanding these protocols is essential for anyone who works with computers or computer networks, as it allows them to troubleshoot problems, optimize network performance, and ensure that their systems are secure and reliable.</p>
<h3 id="heading-types-of-internet-protocols">Types of Internet Protocols</h3>
<p>There are multiple Internet Protocols. Each of them is described in detail here:</p>
<ol>
<li><p><strong>TCP/IP</strong> is a protocol suite that provides the fundamental communication protocols for the internet and most modern computer networks. It was developed in the 1970s by the United States Department of Defense, and it is now widely used in both public and private networks. It is a set of standard rules that allows different types of computers to communicate with each other. IP ensures that each computer connected to the Internet has a unique numerical identifier called the <strong>IP address</strong>. TCP specifies how data is exchanged over the internet and how it should be broken into segments that are carried in IP packets. It also makes sure that the packets carry information about the source of the message data, the destination of the message data, and the sequence in which the message data should be re-assembled, and it checks whether the message has arrived correctly at its destination. TCP is also known as a connection-oriented protocol. TCP/IP is composed of four layers, each of which performs a specific set of functions:</p>
<ol>
<li><p><strong>Application layer</strong>: The application layer is the top layer of the TCP/IP protocol stack. It is responsible for handling the specific protocols used by applications, such as HTTP for web browsing, SMTP for email, and FTP for file transfers.</p>
</li>
<li><p><strong>Transport layer</strong>: The transport layer is responsible for end-to-end communication between applications on different devices. This layer uses two protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). TCP is a connection-oriented protocol that provides reliable, ordered transmission of data, while UDP is connectionless, offers no delivery guarantee, and is used for applications that require the fast and lightweight transmission of data.</p>
</li>
<li><p><strong>Internet layer</strong>: The internet layer is responsible for addressing and routing data across networks. This layer uses the Internet Protocol (IP) to assign unique addresses to devices and to determine the best path for data to travel from the source to the destination.</p>
</li>
<li><p><strong>Link layer</strong>: The link layer is the lowest layer of the TCP/IP protocol stack. It is responsible for transmitting data over physical media, such as Ethernet or Wi-Fi. This layer is responsible for data framing, error detection, and flow control.</p>
</li>
</ol>
</li>
</ol>
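<p>One way to picture the four layers listed above is as nested wrappers: each layer adds its own header around whatever the layer above handed down. The sketch below uses plain dictionaries instead of real binary headers, and the port numbers and addresses (taken from ranges reserved for documentation) are arbitrary assumptions.</p>

```python
def encapsulate(message: str) -> dict:
    """Wrap an application message in transport, internet, and link layers."""
    application = {"protocol": "HTTP", "data": message}
    transport = {"protocol": "TCP", "src_port": 50000, "dst_port": 80,
                 "payload": application}
    internet = {"protocol": "IP", "src": "192.0.2.10", "dst": "198.51.100.20",
                "payload": transport}
    link = {"protocol": "Ethernet", "payload": internet}
    return link


def decapsulate(frame: dict) -> str:
    """Peel the layers off again, as the receiving machine would."""
    layer = frame
    while "payload" in layer:
        layer = layer["payload"]
    return layer["data"]


frame = encapsulate("GET /index.html")
print(frame["protocol"], "->", frame["payload"]["protocol"], "->",
      frame["payload"]["payload"]["protocol"])
print(decapsulate(frame))
```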
<p>    Together, these four layers of the TCP/IP protocol suite provide a standardized set of protocols that allow devices to communicate with each other over networks, regardless of their underlying hardware or software.</p>
<p>    <img src="https://i.ytimg.com/vi/MVihcigDlbA/maxresdefault.jpg" alt="Network Protocols and the 4 Layer Model - YouTube" /></p>
<ol>
<li><p><strong>HTTP</strong> (Hypertext Transfer Protocol) is a protocol used for transmitting data over the World Wide Web. It is a part of the application layer of the TCP/IP protocol suite. HTTP is used by web browsers, such as Google Chrome and Mozilla Firefox, to access web pages on the internet. HTTP works by establishing a connection between the user's computer and the web server hosting the web page, after which the server sends the web page data to the user's computer. HTTP supports a variety of data types, including text, images, and video. This protocol defines how the information needs to be formatted and transmitted. It also defines the actions that web servers and browsers should take in response to the requests made to access a particular web page.</p>
</li>
<li><p><strong>FTP</strong> (File Transfer Protocol) is a protocol used for transferring files between two devices over a network. It is a part of the application layer of the TCP/IP protocol suite. FTP is used by users to upload and download files from a remote server. FTP works by establishing a connection between the user's computer and the remote server, after which the user can transfer files to and from the server. When a machine requests a file transfer from another machine, FTP sets up a connection between the two and the client authenticates with a user ID and password. The desired file transfer then takes place between the machines.</p>
</li>
<li><p><strong>SMTP</strong> (Simple Mail Transfer Protocol) is a protocol used for sending and relaying email messages over the internet; retrieving mail from a mailbox is handled by other protocols such as IMAP or POP3. It is a part of the application layer of the TCP/IP protocol suite. SMTP is used by email clients, such as Microsoft Outlook, to send emails to an SMTP server, which then forwards the email to its intended recipient. SMTP works by establishing a connection between the email client and the SMTP server, after which the client sends the email message to the server, which then relays the message to the recipient's email server. The server uses the recipient address supplied with the message to place the mail in its queue of outgoing mail, and as soon as it delivers the mail to the receiving server, it removes the email from the queue. The message may contain text, video, images, etc. SMTP also defines the rules that mail servers follow when communicating with each other.</p>
</li>
<li><p><strong>SFTP (Secure File Transfer Protocol):</strong> SFTP, also known as SSH File Transfer Protocol, transfers files over a Secure Shell (SSH) connection, so both commands and data are encrypted while in transmission. Despite the name, it is not FTP run over SSH but a separate protocol designed as an extension to SSH. It is used to upload, download, and manage files on a remote system, while SSH itself is what lets you connect to other systems and execute commands from the command line.</p>
</li>
<li><p><strong>HTTPS (HyperText Transfer Protocol Secure):</strong> HTTPS is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, with the SSL/TLS protocol providing encryption and authentication. A website can be served over plain HTTP, but if it receives sensitive information such as credit card details, debit card details, or OTPs, it needs an SSL certificate installed so that it can be served over HTTPS. So, before entering any sensitive information on a website, we should check whether the link is HTTPS or not. If it is not HTTPS, it may not be secure enough to enter sensitive information.</p>
</li>
<li><p><strong>ICMP</strong> (Internet Control Message Protocol) is a network protocol that is used to send error messages and operational information about network conditions. It is an integral part of the Internet Protocol (IP) suite and is used to help diagnose and troubleshoot issues with network connectivity. ICMP messages are typically generated by network devices, such as routers, in response to errors or exceptional conditions encountered in forwarding a datagram. Some examples of ICMP messages include:</p>
<ul>
<li><p>Echo Request and Echo Reply (ping)</p>
</li>
<li><p>Destination Unreachable</p>
</li>
<li><p>Time Exceeded</p>
</li>
<li><p>Redirect</p>
</li>
</ul>
</li>
</ol>
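<p>As a rough illustration of the Echo Request message listed above, the sketch below builds the bytes of an ICMP echo request (type 8, code 0) and fills in the standard Internet checksum. It only constructs the packet in memory; actually sending it would need a raw socket and elevated privileges, which is not shown. The function names and the identifier, sequence, and payload values are my own choices.</p>

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

<p>A handy property of this checksum is that recomputing it over the finished packet gives zero, which is how a receiver can verify the message.</p>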
<p>    ICMP can also be used by network management tools to test the reachability of a host and measure the round-trip time for packets to travel from the source to the destination and back. It should be noted that ICMP has no built-in security, and it can be abused in some types of network attacks, such as ping floods and Smurf (amplification) attacks.</p>
<ol>
<li><p><strong>UDP</strong> (User Datagram Protocol) is a connectionless, unreliable transport layer protocol. Unlike TCP, it does not establish a connection between devices before transmitting data, and it does not guarantee that data packets will be received in the order they were sent or that they will be received at all. Instead, UDP simply sends packets of data to a destination without retransmission, ordering, or flow control; the only error detection it offers is a basic checksum. UDP is typically used for real-time applications such as streaming video and audio, online gaming, and VoIP (Voice over Internet Protocol) where a small amount of lost data is acceptable and low latency is important. UDP is faster than TCP because it has less overhead. It doesn’t need to establish a connection, so it can send data packets immediately. It also doesn’t need to wait for confirmation that the data was received before sending more, so it can transmit data at a higher rate.</p>
</li>
<li><p><strong>IMAP</strong> (Internet Message Access Protocol) is a protocol used for retrieving emails from a mail server. It allows users to access and manage their emails on the server, rather than downloading them to a local device. This means that the user can access their emails from multiple devices and the emails will be synced across all devices. IMAP is more flexible than POP3 (Post Office Protocol version 3) as it allows users to access and organize their emails on the server, and also allows multiple users to access the same mailbox.</p>
</li>
</ol>
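<p>To see how little ceremony UDP needs, here is a small sketch that sends one datagram between two sockets on the local machine using Python's standard <code>socket</code> module. There is no handshake and no acknowledgement; the message text and the use of the loopback address are assumptions for the demo.</p>

```python
import socket

# Receiver: bind to the loopback address and let the OS choose a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever if the datagram is lost
address = receiver.getsockname()

# Sender: no connect step, the datagram is simply sent to the address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", address)

data, source = receiver.recvfrom(1024)
print(data, source)

sender.close()
receiver.close()
```

<p>On the loopback interface the datagram practically always arrives, but over a real network nothing in UDP itself would tell the sender if it had been dropped.</p>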
<hr />
<p><strong>NOTE:</strong> An SSL (Secure Sockets Layer) certificate is a digital certificate that proves the identity of a website and makes encrypted data transmission over the internet possible. It is used to establish a secure connection between a web server and a web browser or other client software, ensuring that all data transferred between the two is encrypted and cannot be read or tampered with by unauthorized third parties.</p>
<p>SSL certificates are issued by trusted third-party Certificate Authorities (CAs) after verifying the identity of the website owner and ensuring that they have the legal right to use the domain name associated with the website. Once installed on the web server, the SSL certificate activates the HTTPS protocol, which adds an additional layer of security to the standard HTTP protocol used for web communication.</p>
<p>When a user accesses a website with an SSL certificate, their web browser checks the certificate to ensure that it is valid and issued by a trusted CA. If the certificate is valid, the browser initiates a secure session with the website, which encrypts all data transmitted between the two parties using a cryptographic key.</p>
<p>SSL certificates are essential for protecting sensitive information, such as personal data, login credentials, and financial transactions, from interception or theft by hackers or other malicious actors. They are widely used by e-commerce sites, banks, healthcare providers, and other organizations that handle sensitive information online.</p>
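<p>The certificate checks described in this note are what a client does by default in most languages. As a small sketch, Python's standard <code>ssl</code> module creates a client context that requires a valid certificate from a trusted CA and checks that it matches the hostname. No connection is made here; the snippet only inspects the default settings.</p>

```python
import ssl

# Default client-side settings: trust the system's CA certificates.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted CA...
certificate_required = context.verify_mode == ssl.CERT_REQUIRED
# ...and the certificate must be issued for the hostname being contacted.
hostname_checked = context.check_hostname

print(certificate_required, hostname_checked)
```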
<hr />
<h3 id="heading-are-all-these-protocols-children-of-tcpip">Are all these protocols children of TCP/IP?</h3>
<p>Not all of the protocols I mentioned are directly part of TCP/IP, but many of them are used in conjunction with it. For example, DNS, DHCP, SNMP, and SSH are application-layer protocols that rely on the transport layer (TCP or UDP) underneath them. SMTP, POP3, and IMAP also operate at the application layer, which is the top layer of the TCP/IP protocol stack. RTP and RTSP are often used for streaming media over the internet. SIP is also an application layer protocol that is used for voice and video communication over IP networks. So while these protocols may not all be direct descendants of TCP/IP, they are often used together with TCP/IP to enable communication over computer networks and the internet.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:875/1*pb-b83P8BATBvQSTUiyGkQ.png" alt /></p>
<p><strong>NOTE</strong>: <strong>Not all the Protocols are mentioned. Only the important ones are listed above.</strong> <strong>Basic Knowledge about these protocols is enough to get started with Web Dev.</strong></p>
<h3 id="heading-some-other-important-protocols">Some other important protocols.</h3>
<ol>
<li><p><strong>IPv4 and IPv6</strong>: IPv4 (Internet Protocol version 4) and IPv6 (Internet Protocol version 6) are two different versions of the Internet Protocol, which is the fundamental protocol that is used for communication over the internet. IPv4 is the older of the two protocols and has been in use since the early days of the internet. It uses 32-bit addresses and is capable of supporting up to about 4.3 billion unique addresses. However, with the explosion of internet-connected devices in recent years, the available pool of IPv4 addresses has been rapidly depleted, leading to the adoption of IPv6. IPv6 uses 128-bit addresses, which allows for an almost unlimited number of unique addresses (about 340 undecillion, or roughly 3.4 × 10<sup>38</sup>, a 39-digit number). This is enough to provide every device on the planet with a unique address and allow for the continued growth of the internet. IPv6 also includes several other improvements over IPv4, including better support for quality of service (QoS) and security, as well as simpler address allocation and configuration. While IPv6 has been available for many years, its adoption has been slow because existing network infrastructure and devices need to be upgraded to support it. However, as the pool of available IPv4 addresses continues to shrink, the need for widespread adoption of IPv6 is becoming more urgent.</p>
</li>
<li><p><strong>SSH:</strong> SSH (Secure Shell) is a protocol used for secure remote login and other secure network services. It provides a secure and encrypted way to remotely access and manage servers, network devices, and other computer systems. SSH uses public-key cryptography to authenticate the server and, optionally, the user, and it encrypts the data being transmitted, making it much more secure than traditional remote login protocols such as Telnet. SSH also allows for secure file transfers using the SCP (Secure Copy) and SFTP (SSH File Transfer Protocol) protocols. It is widely used in Unix-based operating systems and is also available for Windows. It is commonly used by system administrators, developers, and other technical users to remotely access and manage servers and other network devices.</p>
</li>
</ol>
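<p>To make the IPv4 vs IPv6 address counts above concrete, here is a small sketch that computes both address spaces. It uses <code>BigInt</code> so the 128-bit value stays exact; the variable names are only illustrative.</p>

```javascript
// Address-space sizes for 32-bit (IPv4) and 128-bit (IPv6) addresses.
// BigInt literals (the trailing "n") keep the large value exact.
const ipv4Addresses = 2n ** 32n;
const ipv6Addresses = 2n ** 128n;

console.log(ipv4Addresses.toString());        // 4294967296 (about 4.3 billion)
console.log(ipv6Addresses.toString());        // 340282366920938463463374607431768211456
console.log(ipv6Addresses.toString().length); // 39 digits
```

<p>The raw count includes reserved and special-purpose ranges, so the number of addresses that can actually be assigned is smaller in both cases.</p>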
<hr />
<p>We will talk more about IPs, Domains, Subnet Masks, and other related concepts in the next blog. This was just a packed blog containing all the basic theory stuff as a refresher. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Web Basics - Part 1]]></title><description><![CDATA[The History of the Web
The internet started as a research project in the late 1960s by the United States Department of Defense's Advanced Research Projects Agency (ARPA). The goal was to create a network that would allow researchers at different univ...]]></description><link>https://highonbugs.sbk2k1.in/web-basics-1</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/web-basics-1</guid><category><![CDATA[Web Development]]></category><category><![CDATA[WWW]]></category><category><![CDATA[web]]></category><category><![CDATA[Web3]]></category><category><![CDATA[webdev]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 12 Mar 2023 13:40:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678625379568/3606408e-0b4b-45e0-a31f-f4b19ca5b116.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-the-history-of-the-web">The History of the Web</h3>
<p>The internet started as a research project in the late 1960s by the United States Department of Defense's Advanced Research Projects Agency (ARPA). The goal was to create a network that would allow researchers at different universities and institutions to communicate and share information more efficiently.</p>
<p>The initial version of the internet, called ARPANET, was created in 1969 and connected four universities in the United States. It used packet-switching technology to transmit data between computers, which allowed information to be broken down into small packets and sent across the network. This was a significant improvement over earlier communication systems that used dedicated point-to-point connections.</p>
<p>Over the next few decades, the internet grew and evolved rapidly. Around 1989 to 1991, the development of the World Wide Web (WWW) by Tim Berners-Lee at CERN in Switzerland allowed users to access and share information using hypertext links. This made the internet much more user-friendly and accessible to a broader audience.</p>
<p>In the 1990s, the commercialization of the internet began, and companies started building websites and offering online services to consumers. The development of web browsers and search engines made it easier for users to find and navigate the web.</p>
<p>Today, the internet is a vast network of interconnected computers and servers, and it's an essential part of modern life. From email and social media to online shopping and streaming video, the internet has transformed the way we communicate, work, and interact with the world around us.</p>
<h3 id="heading-the-arrival-of-tcpip">The arrival of TCP/IP</h3>
<p>After the emergence of individual networks like ARPANET, one of the main problems was that these networks were using different protocols and technologies. This made it difficult to connect with them and share information with them. For example, a computer on one network might not be able to communicate with a computer on another network because they were using different communication protocols.</p>
<p>To solve this problem, the TCP/IP protocol was developed in the 1970s. TCP/IP stands for Transmission Control Protocol/Internet Protocol, and it's a set of standards for transmitting data over networks, including the internet. TCP is responsible for breaking data into packets, reassembling them at the destination, and ensuring that they arrive in the correct order. IP, on the other hand, is responsible for addressing and routing packets across the network.</p>
<p>By using a common set of standards, TCP/IP made it possible to connect different networks and communicate between them. This laid the foundation for the internet as we know it today.</p>
<p>The birth of the World Wide Web was another significant development in the history of the internet. In 1989, Tim Berners-Lee, a computer scientist at CERN in Switzerland, proposed a new way of sharing and accessing the information on the internet using hypertext links. This was the beginning of the World Wide Web.</p>
<p>Berners-Lee developed three key technologies to make the web work: HTML (Hypertext Markup Language), which is used to create web pages; HTTP (Hypertext Transfer Protocol), which is used to transfer data between web servers and clients; and URLs (Uniform Resource Locators), which are used to identify and locate web pages on the internet.</p>
<p>These technologies made it possible to create and share information on the web in a way that was easy to use and accessible to a broad audience. The web quickly grew in popularity and became an essential part of the internet, leading to the development of new technologies and applications that continue to shape the way we use and interact with the internet today.</p>
<h3 id="heading-contributions-of-sir-tim-berners-lee">Contributions of Sir Tim Berners-Lee</h3>
<p>Tim Berners-Lee is the inventor of the World Wide Web, which revolutionized the way we share and access information on the internet. Berners-Lee developed three key technologies to make the web work:</p>
<ol>
<li><p><strong>HTML (Hypertext Markup Language):</strong> HTML is a markup language used to create web pages. It provides a standardized way of structuring content on the web, using tags and attributes to define headings, paragraphs, images, links, and other elements of a web page. HTML allows web developers to create rich, interactive content that can be viewed and accessed by anyone with an internet connection.</p>
</li>
<li><p><strong>HTTP (Hypertext Transfer Protocol):</strong> HTTP is the protocol used to transfer data between web servers and clients. It enables clients (such as web browsers) to request web pages and other resources from servers, and it allows servers to respond with the requested data. HTTP is a stateless protocol, which means that each request and response is independent of any previous requests or responses. This keeps servers simple and makes it easier to spread requests across many machines, since no server has to remember earlier requests.</p>
</li>
<li><p><strong>URLs (Uniform Resource Locators):</strong> URLs are used to identify and locate resources on the web, such as web pages, images, and other files. A URL consists of several parts, including the protocol (HTTP or HTTPS), the domain name of the server, and the path to the resource on the server. URLs make it easy for users to navigate the web and access the content they're looking for.</p>
</li>
</ol>
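<p>The parts of a URL described above can be pulled apart with the standard <code>URL</code> class, which is available as a global in both browsers and Node.js. This is a minimal sketch; the address itself is a made-up example, not one used elsewhere in this blog.</p>

```javascript
// Parse a hypothetical URL into the parts described in the list above.
const url = new URL('https://example.com:8080/blog/web-basics?part=1#history');

console.log(url.protocol);                 // 'https:'
console.log(url.hostname);                 // 'example.com'
console.log(url.port);                     // '8080'
console.log(url.pathname);                 // '/blog/web-basics'
console.log(url.searchParams.get('part')); // '1'
console.log(url.hash);                     // '#history'
```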
<p>Together, these technologies provide the foundation for the World Wide Web and enable users to create, share, and access information on the internet. They have played a crucial role in shaping the modern internet and continue to evolve and improve as technology advances.</p>
<h3 id="heading-web-trilogy">Web Trilogy</h3>
<p>The transition from Web1 to Web2 to Web3 represents a shift in the way the internet is used and accessed, as well as the technologies that underpin it. Here is a brief overview of each phase:</p>
<ol>
<li><p><strong>Web1</strong>: The first phase of the web, also known as the static web, was characterized by static HTML pages that provided basic information and limited interaction. Users could browse web pages but could not interact with them beyond clicking on links.</p>
</li>
<li><p><strong>Web2</strong>: The second phase of the web, also known as the social web, saw the emergence of dynamic and interactive web pages that enabled user-generated content, social networking, and e-commerce. Web2 also saw the rise of mobile devices and the use of cloud computing to provide scalable and flexible services.</p>
</li>
<li><p><strong>Web3</strong>: The third phase of the web, also known as the decentralized web or web3, is characterized by a move towards decentralization, blockchain technology, and the use of cryptocurrencies. Web3 aims to provide users with greater control over their data and digital identities and to create decentralized applications that are more secure and resilient.</p>
</li>
</ol>
<p>Centralization vs decentralization is a key theme in the evolution of the web. Centralization refers to the concentration of power and control in the hands of a few large organizations, such as social media giants like Facebook or Twitter. Decentralization, on the other hand, involves distributing power and control across a network of users, creating a more democratic and resilient system.</p>
<p>Early examples of decentralization include Napster, a peer-to-peer file-sharing service that allowed users to share music files directly with each other (although it still relied on a central index to find them), and BitTorrent, a decentralized file-sharing protocol that allowed users to download and share files without relying on a single central server to host them. These technologies represented a shift away from traditional client-server architectures and towards a more distributed and peer-to-peer approach.</p>
<p>Overall, the transition from Web1 to Web2 to Web3 represents an ongoing evolution of the web, driven by changing user needs and technological advances.</p>
<p><strong>NOTE: This is just a short blog on the history of the web. This is just an introduction blog and some theory before the rest of the Web Basics blog and the Backend Development Blog.</strong></p>
<h3 id="heading-evolution-of-web-technologies">Evolution of Web Technologies</h3>
<p>The evolution of web technologies has been a key driver in the development of the web over the past several decades. Here is a brief overview of some of the key technologies that have shaped the web:</p>
<ol>
<li><p><strong>HTML</strong>: Hypertext Markup Language (HTML) is a markup language used to create web pages. The first version of HTML was introduced in 1991, and it has since evolved through several versions, with HTML5 being the most recent. HTML is the foundation of the web, providing the structure and content of web pages.</p>
</li>
<li><p><strong>CSS</strong>: Cascading Style Sheets (CSS) is a language used to define the visual style of web pages, including layout, fonts, colors, and more. CSS was introduced in 1996 as a way to separate the presentation of web pages from their content, allowing for greater flexibility and control.</p>
</li>
<li><p><strong>JavaScript</strong>: JavaScript is a programming language used to add interactivity and functionality to web pages. It was introduced in 1995 and has since become one of the most widely used programming languages in the world. JavaScript allows developers to create dynamic and responsive web pages, as well as to build complex web applications.</p>
</li>
<li><p><strong>XML</strong>: Extensible Markup Language (XML) is a markup language used to describe data and its structure. It was introduced in 1998 as a way to provide a standardized format for exchanging data over the web. XML is often used in web services and APIs to exchange data between different systems.</p>
</li>
<li><p><strong>AJAX</strong>: Asynchronous JavaScript and XML (AJAX) is a technique used to create fast and dynamic web applications. It allows web pages to update content without requiring a full page reload, making web applications feel more like desktop applications. The term was coined in 2005, although the underlying browser features existed earlier, and it has since become a standard technique for building web applications. Despite the name, most modern applications exchange JSON rather than XML.</p>
</li>
</ol>
<p>These are just a few examples of the many technologies that have shaped the evolution of the web over the years. As technology continues to evolve, we can expect to see even more innovations in the years to come. You will get to see a lot more technologies than the ones mentioned here in the blogs coming ahead.</p>
<hr />
<p>If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[Getting into Node.js]]></title><description><![CDATA[Node.js is an open-source, cross-platform, back-end JavaScript runtime environment that is built on Chrome's V8 JavaScript engine. It was created by Ryan Dahl in 2009 and is maintained by the Node.js Foundation. Node.js is used to build fast, scalabl...]]></description><link>https://highonbugs.sbk2k1.in/nodejs</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/nodejs</guid><category><![CDATA[Node.js]]></category><category><![CDATA[node]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Event Loop]]></category><category><![CDATA[asynchronous programming]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Tue, 07 Mar 2023 14:37:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678194626769/94b743da-75f7-4b83-ad28-c9c65faff760.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Node.js is an open-source, cross-platform, back-end JavaScript runtime environment that is built on Chrome's V8 JavaScript engine. It was created by Ryan Dahl in 2009 and is maintained by the Node.js Foundation. Node.js is used to build fast, scalable network applications and web servers, and its popularity has been growing rapidly since its inception.</p>
<p>Node.js was created to address the need for a non-blocking, event-driven I/O model that can handle large amounts of data and connections with high performance. Node.js uses an event loop that listens for incoming requests and responds to them asynchronously, which makes it ideal for handling real-time, data-intensive applications.</p>
<p>Node.js is built on top of Google's V8 JavaScript engine, which uses just-in-time (JIT) compilation to turn JavaScript code into native machine code that can be executed directly by the computer's processor. This is one reason Node.js often performs well compared with purely interpreted runtimes, although the actual speed depends on the workload.</p>
<p><strong>NOTE: Node.js is not a framework. It is a runtime environment on which frameworks like Express are built.</strong></p>
<h3 id="heading-features">Features</h3>
<p>Some features of Node.js are:</p>
<ol>
<li><p><strong>Asynchronous and event-driven</strong>: Node.js is built on an event-driven, non-blocking I/O model that makes it lightweight and efficient. (more on this later)</p>
</li>
<li><p><strong>Cross-platform</strong>: Node.js is built to work on a variety of platforms, including Windows, macOS, and Linux.</p>
</li>
<li><p><strong>Fast</strong>: Node.js is built on the V8 JavaScript engine, which is known for its high performance and speed.</p>
</li>
<li><p><strong>Scalable</strong>: Node.js is designed to handle large-scale applications with ease, thanks to its non-blocking I/O model and event-driven architecture.</p>
</li>
<li><p><strong>Extensible</strong>: Node.js has a large ecosystem of modules and libraries that can be easily installed and used in your applications.</p>
</li>
<li><p><strong>Server-side scripting</strong>: Node.js is often used for server-side scripting, allowing developers to build powerful server-side applications using JavaScript.</p>
</li>
<li><p><strong>Easy to learn</strong>: With its simple syntax and extensive documentation, Node.js is easy to learn for developers who are familiar with JavaScript.</p>
</li>
<li><p><strong>Community-driven</strong>: Node.js has a large and active community of developers, who are constantly contributing to the development of new tools, modules, and libraries.</p>
</li>
<li><p><strong>Open source</strong>: Node.js is an open-source platform, which means that anyone can contribute to its development and use it for free.</p>
</li>
</ol>
<h3 id="heading-nodejs-architecture">Node.js Architecture</h3>
<p>Let us have a look at the Node.js architecture, which explains many of the features listed above.</p>
<p><img src="https://litslink.com/wp-content/uploads/2021/07/Node.js-Architecture-Chart.png" alt="Node.js Architecture From A to Z. Use Cases, Advantages, Big Players |  LITSLINK Blog" /></p>
<p>Node.js architecture is based on an event-driven, non-blocking I/O model that makes it highly efficient and scalable. The architecture consists of several components:</p>
<ol>
<li><p>V8 Engine: Node.js uses the Google V8 JavaScript engine to execute JavaScript code. This engine compiles JavaScript into native machine code that can be executed directly by the CPU, making it faster than traditional interpreters.</p>
</li>
<li><p>libuv: Node.js uses the libuv library to provide the event loop and other asynchronous I/O capabilities. This library is also responsible for handling file system operations, networking, and other system-level functions.</p>
</li>
<li><p>Node.js APIs: Node.js provides a number of built-in APIs for interacting with the file system, networking, and other system-level functions. These APIs are written in C/C++ and are exposed to JavaScript via bindings.</p>
</li>
<li><p>Node.js Runtime: The Node.js runtime is the environment in which Node.js applications run. It provides a JavaScript execution environment, as well as access to system resources such as the file system, network, and operating system.</p>
</li>
<li><p>Operating System: Node.js applications run on top of an operating system, such as Linux, Windows, or macOS. The operating system provides access to hardware resources, such as the CPU, memory, and network interfaces.</p>
</li>
</ol>
<p>Node.js uses an event loop to handle incoming requests and execute non-blocking I/O operations. The event loop is a loop that constantly checks for events and executes the associated callback functions. When a new request comes in, it is added to the event queue, and when the event loop reaches that event, it executes the associated callback function. This means that Node.js can handle many requests simultaneously, without blocking the execution of other requests.</p>
<p>Node.js achieves this by keeping slow I/O off the main thread. Network requests are handled with the operating system's non-blocking socket APIs, while operations that have no non-blocking equivalent, such as reading or writing to the file system, are sent to a separate thread pool. In both cases the main thread is freed up to continue processing other requests. When the operation is complete, the result is handed back to the main thread, which then executes the callback function.</p>
<p>Although Node.js uses a thread pool to offload I/O operations, it is still considered single-threaded because all user code runs on a single thread, and the thread pool is managed internally by Node.js. Additionally, because Node.js uses non-blocking I/O operations, the thread pool is only used for time-consuming tasks, and most operations can be handled by the main event loop.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=6YgsqXlUoTM">https://www.youtube.com/watch?v=6YgsqXlUoTM</a></div>
<p> </p>
<p>Check out this video if you want to know about the event loop in detail. Or if you have half an hour you can go for the one provided below.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=8aGhZQkoFbQ">https://www.youtube.com/watch?v=8aGhZQkoFbQ</a></div>
<p> </p>
<h3 id="heading-what-are-events">What are Events?</h3>
<p>Events in Node.js are a mechanism used to handle and respond to actions or signals that occur asynchronously in a program. In Node.js, the <code>EventEmitter</code> class is used to implement the event-driven architecture. The EventEmitter allows developers to create custom events and handle them with corresponding listeners.</p>
<p>Events in Node.js work on a publisher-subscriber model, where publishers emit events and subscribers listen to those events. When an event is emitted, all subscribers to that event are notified and can respond accordingly. This makes it possible for Node.js applications to be highly responsive to user input and other events that occur in the system.</p>
<p>Node.js has a built-in <code>events</code> module that provides the EventEmitter class and other related utilities for working with events. The <code>events</code> module can be used to implement complex event-driven systems such as web servers, real-time chat applications, and other systems that need to handle multiple concurrent connections.</p>
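<p>Here is a minimal sketch of the publisher-subscriber model using the built-in <code>events</code> module. The event name <code>created</code>, the <code>orders</code> emitter, and the listener messages are invented for illustration.</p>

```javascript
const EventEmitter = require('node:events');

// A hypothetical emitter that publishes an event whenever an order is created.
const orders = new EventEmitter();
const log = [];

// Two subscribers listen for the same event.
orders.on('created', (id) => log.push(`email sent for order ${id}`));
orders.on('created', (id) => log.push(`stock updated for order ${id}`));

// The publisher emits the event; listeners run in the order they were registered.
orders.emit('created', 42);

console.log(log);
// [ 'email sent for order 42', 'stock updated for order 42' ]
```

<p>Note that <code>emit()</code> calls the listeners synchronously, one after another, so both entries are already in <code>log</code> by the time <code>emit()</code> returns.</p>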
<h3 id="heading-the-non-blocking-model">The Non-Blocking model</h3>
<p>This is one of the key features of the Node.js architecture. Let us first understand the difference between blocking and non-blocking code, using the explanation from GeeksforGeeks.</p>
<p><strong>Blocking:</strong> It refers to the blocking of further operations until the current operation finishes. Blocking methods are executed synchronously. Synchronously means that the program is executed line by line. The program waits until the called function or the operation returns.</p>
<p><strong>Example:</strong> The following example uses the <a target="_blank" href="https://www.geeksforgeeks.org/node-js-fs-readfilesync-method/"><strong>readFileSync()</strong></a> function to read files and demonstrate Blocking in Node.js</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> fs = <span class="hljs-built_in">require</span>(<span class="hljs-string">'fs'</span>);

<span class="hljs-keyword">const</span> filepath = <span class="hljs-string">'text.txt'</span>;

<span class="hljs-comment">// Reads a file in a synchronous and blocking way</span>
<span class="hljs-keyword">const</span> data = fs.readFileSync(filepath, {<span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span>});

<span class="hljs-comment">// Prints the content of file</span>
<span class="hljs-built_in">console</span>.log(data);

<span class="hljs-comment">// This section calculates the sum of numbers from 1 to 10</span>
<span class="hljs-keyword">let</span> sum = <span class="hljs-number">0</span>;
<span class="hljs-keyword">for</span>(<span class="hljs-keyword">let</span> i=<span class="hljs-number">1</span>; i&lt;=<span class="hljs-number">10</span>; i++){
    sum = sum + i;
}

<span class="hljs-comment">// Prints the sum</span>
<span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Sum: '</span>, sum);
</code></pre>
<p>Run the <strong>index.js</strong> file using the following command:</p>
<pre><code class="lang-bash">node index.js
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-bash">This is from text file.
Sum:  55
</code></pre>
<p><strong>Non-Blocking:</strong> It refers to the program that does not block the execution of further operations. Non-Blocking methods are executed asynchronously. Asynchronously means that the program may not necessarily execute line by line. The program calls the function and moves to the next operation and does not wait for it to return.</p>
<p><strong>Example:</strong> The following example uses the <a target="_blank" href="https://www.geeksforgeeks.org/node-js-fs-readfile-method/"><strong>readFile()</strong></a> function to read files and demonstrate Non-Blocking in Node.js</p>
<p>Save the following code as <strong>index.js</strong>:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> fs = <span class="hljs-built_in">require</span>(<span class="hljs-string">'fs'</span>);

<span class="hljs-keyword">const</span> filepath = <span class="hljs-string">'text.txt'</span>;

<span class="hljs-comment">// Reads a file in an asynchronous and non-blocking way</span>
fs.readFile(filepath, {<span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span>}, <span class="hljs-function">(<span class="hljs-params">err, data</span>) =&gt;</span> {
    <span class="hljs-comment">// Stops with the error if the file could not be read</span>
    <span class="hljs-keyword">if</span> (err) <span class="hljs-keyword">throw</span> err;
    <span class="hljs-comment">// Prints the content of file</span>
    <span class="hljs-built_in">console</span>.log(data);
});


<span class="hljs-comment">// This section calculates the sum of numbers from 1 to 10</span>
<span class="hljs-keyword">let</span> sum = <span class="hljs-number">0</span>;
<span class="hljs-keyword">for</span>(<span class="hljs-keyword">let</span> i=<span class="hljs-number">1</span>; i&lt;=<span class="hljs-number">10</span>; i++){
    sum = sum + i;
}

<span class="hljs-comment">// Prints the sum</span>
<span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Sum: '</span>, sum);
</code></pre>
<p>Run the <strong>index.js</strong> file using the following command:</p>
<pre><code class="lang-bash">node index.js
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-bash">Sum:  55
This is from text file.
</code></pre>
<p>In the non-blocking program, the sum actually prints before the content of the file. This is because the program does not wait for the file read to finish: readFile() only starts the read and returns right away, so the program moves on to the next operation. When the read completes, the callback runs and prints the content.</p>
<h3 id="heading-how-does-nodejs-handle-this">How does Node.js handle this?</h3>
<p>Node.js uses an event-driven, non-blocking I/O model. When an asynchronous operation such as <code>readFile()</code> is called, Node.js hands the work to libuv, which either asks the operating system kernel to perform it without blocking (network sockets, for example) or runs it on a thread pool. This frees up the event loop to handle other tasks. Once the operation is complete, its callback is queued and the event loop runs it.</p>
<p>Blocking code, on the other hand, runs directly on the main thread. A synchronous call such as <code>readFileSync()</code>, or a long-running loop, keeps the event loop busy, and no other callback can run until it returns.</p>
<p>The thread pool is used by the asynchronous functions of some built-in modules such as <code>fs</code>, <code>crypto</code>, and <code>zlib</code>. These operations are executed outside of the main event loop to prevent blocking the execution of other code.</p>
<p>In summary, Node.js stays responsive by delegating asynchronous operations to the system kernel or the thread pool and running their callbacks on the event loop once the results are ready. Synchronous functions are not offloaded: they block the main thread for as long as they run, which is why they should be avoided on hot paths in a server.</p>
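<p>To see how synchronous work holds up the event loop, here is a small sketch: a timer is scheduled for 0 ms, but it cannot fire until a busy-wait of roughly 200 ms has finished. The 200 ms figure and the variable names are arbitrary choices for the demo.</p>

```javascript
const start = Date.now();
const events = [];

// Ask for the callback to run as soon as possible.
setTimeout(() => {
  events.push('timer callback');
  console.log(`timer ran after ${Date.now() - start} ms`);
}, 0);

// Synchronous busy-wait: the main thread is occupied for about 200 ms,
// so the event loop cannot run the timer callback in the meantime.
const deadline = start + 200;
while (deadline > Date.now()) {
  // intentionally empty
}

events.push('synchronous work finished');
console.log(events); // only the synchronous entry so far
```

<p>The timer callback only runs after the loop ends, so the reported delay is close to the length of the busy-wait rather than 0 ms.</p>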
<h3 id="heading-a-step-by-step-explanation">A step-by-step explanation.</h3>
<p>Thanks to Andrew Mead for explaining this in his course.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:875/1*BBlPbUjGVtfSPd7BHa1LHw.png" alt /></p>
<ol>
<li><p>Push <code>main()</code> onto the call stack.</p>
</li>
<li><p>Push <code>console.log()</code> onto the call stack. This then runs right away and gets popped.</p>
</li>
<li><p>Push <code>setTimeout(2000)</code> onto the stack. <code>setTimeout(2000)</code> is a Node API. When we call it, we register an event-callback pair: the event is the 2000-millisecond wait, and the callback is the function to run once that wait is over.</p>
</li>
<li><p>After registering it in the APIs, <code>setTimeout(2000)</code> gets popped from the call stack.</p>
</li>
<li><p>Now the second <code>setTimeout(0)</code> gets registered in the same way. We now have two Node APIs waiting to execute.</p>
</li>
<li><p>After waiting for 0 seconds, the callback of <code>setTimeout(0)</code> gets moved to the callback queue. The same thing happens with the callback of <code>setTimeout(2000)</code> once its 2000 milliseconds have passed.</p>
</li>
<li><p>In the callback queue, the functions wait for the call stack to be empty because only one statement can execute at a time. This is taken care of by the event loop.</p>
</li>
<li><p>The last <code>console.log()</code> runs and the <code>main()</code> gets popped from the call stack.</p>
</li>
<li><p>The event loop sees that the call stack is empty and the callback queue is not empty. So it moves the callbacks (in a first-in-first-out order) to the call stack for execution.</p>
</li>
</ol>
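<p>The steps above can be reproduced with a short script. The code in the diagram is not available as text here, so this is a reconstruction under that assumption: the messages and the <code>record</code> helper are made up, and only the two timer delays come from the steps.</p>

```javascript
const order = [];

// Hypothetical helper: remembers each label and prints it.
function record(label) {
  order.push(label);
  console.log(label);
}

record('Starting');                                // runs right away
setTimeout(() => record('2 second timer'), 2000);  // registered, fires later
setTimeout(() => record('0 second timer'), 0);     // registered, waits for an empty call stack
record('Stopping');                                // runs before either timer callback

// Printed order:
// Starting
// Stopping
// 0 second timer
// 2 second timer
```

<p>Even with a delay of 0, the timer callback has to wait in the callback queue until the main script has finished, which is why "Stopping" is printed before it.</p>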
<h3 id="heading-how-is-nodejs-different-from-normal-javascript">How is Node.js different from normal JavaScript?</h3>
<p>JavaScript (JS) and Node.js are both based on the same programming language, but they have some key differences:</p>
<ol>
<li><p><strong>Environment</strong>: Browser JS runs inside a web browser, while Node.js runs JavaScript outside the browser, typically on a server.</p>
</li>
<li><p><strong>Modules</strong>: Browser JS uses ES modules, whereas Node.js supports both CommonJS (<code>require</code>) and ES modules (<code>import</code>), along with a large package ecosystem, which enables developers to write modular and reusable code.</p>
</li>
<li><p><strong>APIs</strong>: JS has a set of APIs for manipulating the Document Object Model (DOM), while Node.js has APIs for file system operations, networking, and more.</p>
</li>
<li><p><strong>Global objects</strong>: JS has a set of global objects that are available in the browser environment, such as "window" and "document". Node.js has a different set of global objects that are available in the server-side environment, such as "process" and "console".</p>
</li>
<li><p><strong>Threading</strong>: JS is single-threaded, which means it can only execute one task at a time. Node.js is also single-threaded, but its sophisticated architecture helps it to handle more concurrent connections.</p>
</li>
<li><p><strong>Performance</strong>: JS is designed to run in a browser environment, so it is optimized for handling user interactions and rendering web pages. Node.js is optimized for handling I/O operations and network requests, so it is faster and more efficient for server-side tasks.</p>
</li>
</ol>
<p>These points may also lead to the question, "what's the difference between CommonJS and ES6?"</p>
<p>Let us have a look at CommonJS vs ES modules in depth:</p>
<table><tbody><tr><td><p><strong><em>Basis</em></strong></p></td><td><p><strong><em>CommonJS</em></strong></p></td><td><p><strong><em>ES Module</em></strong></p></td></tr><tr><td><p>Origin</p></td><td><p>The original Node.js module system</p></td><td><p>Part of the JavaScript standard since ES6 (ES2015)</p></td></tr><tr><td><p>Environment</p></td><td><p>Node.js; browsers need a bundler to use it</p></td><td><p>Supported natively by modern browsers and by Node.js</p></td></tr><tr><td><p>Import</p></td><td><p><code>require()</code>, an ordinary function that can be called anywhere in the code</p></td><td><p><code>import</code> statements at the top level of the file, plus dynamic <code>import()</code></p></td></tr><tr><td><p>Export</p></td><td><p>Assign to <code>module.exports</code> or <code>exports</code></p></td><td><p><code>export</code> and <code>export default</code> statements</p></td></tr><tr><td><p>Loading</p></td><td><p>Synchronous, resolved while the code runs</p></td><td><p>Asynchronous; static imports are resolved before the module runs</p></td></tr><tr><td><p>Static analysis</p></td><td><p>Harder, because dependencies can be computed at runtime</p></td><td><p>Easier, which lets bundlers drop unused exports (tree shaking)</p></td></tr><tr><td><p>Strict mode</p></td><td><p>Opt-in with <code>"use strict"</code></p></td><td><p>Always on</p></td></tr><tr><td><p>Files in Node.js</p></td><td><p><code>.cjs</code>, or <code>.js</code> by default</p></td><td><p><code>.mjs</code>, or <code>.js</code> when <code>package.json</code> sets <code>"type": "module"</code></p></td></tr></tbody></table>

<p><strong>NOTE: In JavaScript, a module is a self-contained unit of code that defines a set of related functions, objects, and data. A module system is a way to organize and manage these modules in a codebase. CommonJS, ES modules (introduced in ES6), and AMD are all module systems.</strong></p>
<h3 id="heading-why-is-nodejs-scalable">Why is Node.js Scalable?</h3>
<p>Node.js is scalable because of its non-blocking I/O model (mentioned above), which allows it to handle many concurrent connections efficiently. In traditional thread-per-connection server models, each incoming connection occupies a thread that sits idle while it waits on I/O, which degrades performance under high traffic. Node.js's event-driven, single-threaded architecture instead lets one thread keep a large number of requests in flight without getting bogged down.</p>
<p>When a request needs I/O, Node.js starts the operation and continues processing other requests instead of waiting. When the I/O operation is complete, its callback is placed in the queue, and the event loop executes it once the call stack is free. This approach allows Node.js to handle multiple requests at the same time without blocking the main thread, making it a highly scalable platform for building networked applications.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:780/0*kJ6vNeUGrJ0x115j." alt /></p>
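<p>The non-blocking behaviour described above can be seen with a single file system call. This is a minimal sketch: the "request 1" and "request 2" labels are made up for illustration, and listing the current directory stands in for whatever I/O a real request would need.</p>

```javascript
const fs = require("node:fs");

const events = [];

events.push("request 1: start listing the directory");

// readdir hands the work off and returns immediately. The callback runs
// later, once the result is ready and the call stack is empty.
fs.readdir(process.cwd(), (err, entries) => {
  if (err) {
    events.push(`request 1: failed (${err.code})`);
  } else {
    events.push(`request 1: done (${entries.length} entries)`);
  }
  console.log(events.join("\n"));
});

// The main thread is free to do other work while the I/O is in flight.
events.push("request 2: handled while request 1 waits on I/O");
```

<p>The "request 2" line is always recorded before the result of request 1, even though request 1 started first.</p>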
<h3 id="heading-some-drawbacks">Some drawbacks.</h3>
<p>While Node.js has many advantages, it also has some disadvantages that developers should be aware of.</p>
<ol>
<li><p><strong>Not ideal for CPU-intensive applications</strong>: Node.js is designed to handle I/O-bound tasks, making it less efficient for applications that require a lot of CPU processing power. These types of applications could cause the event loop to block, slowing down the entire server.</p>
</li>
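<li><p><strong>Asynchronous programming</strong> can be difficult for new developers to understand, since code does not always run in the order it is written.</p>
</li>
<li><p><strong>Not suitable for heavy-load applications</strong>: While Node.js can handle a large number of concurrent connections, a single Node.js process runs JavaScript on one thread, so adding more CPU cores to the machine (scaling vertically) does not help by itself. Extremely high-traffic applications usually need to run multiple processes and balance load between them.</p>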
</li>
</ol>
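<p>The first drawback is easy to demonstrate. In this minimal sketch, <code>blockFor</code> is a hypothetical stand-in for any CPU-heavy task; the 200 ms figure is arbitrary.</p>

```javascript
// Hypothetical CPU-bound task: busy-waits for roughly `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Burning CPU on the main thread; nothing else can run meanwhile.
  }
}

const scheduledAt = Date.now();

// Asked to run "immediately", but it has to wait for the call stack to clear.
setTimeout(() => {
  const delay = Date.now() - scheduledAt;
  console.log(`0 ms timer fired after about ${delay} ms`);
}, 0);

blockFor(200); // every pending callback and request is stalled meanwhile
const blockedFor = Date.now() - scheduledAt;
console.log(`main thread was busy for ${blockedFor} ms`);
```

<p>The timer callback cannot fire until the loop finishes, so on a real server every other request would be stuck for the same amount of time. Moving such work off the main thread, for example with the built-in <code>node:worker_threads</code> module, is the usual way around this.</p>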
<hr />
<p>This was mostly a theoretical blog on Nodejs. The main aim was to cram as much info into a single place to serve as a reference before interview prep. If you did find this helpful, stay tuned/like/all that good stuff. Will write on more topics. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[REST and REST API - Part 3]]></title><description><![CDATA[You will need to go through the 1st and 2nd blogs to get what's happening here.
Endpoint: https://hob-api.vercel.app/
GitHub repository: https://github.com/sbk2k1/API-Blog

/get Routes
GET HTTP requests are used to retrieve or fetch data from a serve...]]></description><link>https://highonbugs.sbk2k1.in/rest-part-3</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/rest-part-3</guid><category><![CDATA[REST API]]></category><category><![CDATA[REST]]></category><category><![CDATA[restful]]></category><category><![CDATA[APIs]]></category><category><![CDATA[#PostmanAPI]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Mon, 06 Mar 2023 08:54:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678044005456/4c51084a-2cc5-4fd5-a327-a4693c8a2003.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You will need to go through the <a target="_blank" href="https://highonbugs.sbk2k1.me/rest-part-1">1st</a> and <a target="_blank" href="https://highonbugs.sbk2k1.me/rest-part-2">2nd</a> blogs to get what's happening here.</p>
<p>Endpoint: <a target="_blank" href="https://hob-api.vercel.app/">https://hob-api.vercel.app/</a></p>
<p>GitHub repository: <a target="_blank" href="https://github.com/sbk2k1/API-Blog">https://github.com/sbk2k1/API-Blog</a></p>
<hr />
<h2 id="heading-get-routes">/get Routes</h2>
<p>GET HTTP requests are used to retrieve data from a server, for example when a user opens a webpage or an application fetches a specific record. GET has a few constraints. Any data it carries travels in the URL, and URL length is limited in practice, which caps how much can be sent. GET requests are also recorded in browser history and can be cached, so sensitive information such as passwords or other authentication credentials should never be placed in the URL. In REST APIs, a GET request conventionally does not carry a request body.</p>
<p>Create a new request in Postman or your browser as shown below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678028074617/72f97f79-a25b-465a-b590-36ec181976b6.png" alt class="image--center mx-auto" /></p>
<p>Fire it off!</p>
<p>If you get a Status 200 OK response, you'll probably get a response body like this:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"error"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Get Request Received Successfully"</span>,
    <span class="hljs-attr">"query_parameters"</span>: {},
    <span class="hljs-attr">"bearer_token"</span>: <span class="hljs-string">"Bearer Token is absent"</span>,
    <span class="hljs-attr">"headers"</span>: {
        <span class="hljs-attr">"host"</span>: <span class="hljs-string">"hob-api.vercel.app"</span>,
        <span class="hljs-attr">"x-real-ip"</span>: <span class="hljs-string">"xx.xxx.xxx.xxx"</span>,
        <span class="hljs-attr">"x-vercel-proxy-signature-ts"</span>: <span class="hljs-string">"xxxxxxxxx8"</span>,
        <span class="hljs-attr">"x-vercel-deployment-url"</span>: <span class="hljs-string">"hob-o362hip79-sbk2k1.vercel.app"</span>,
        <span class="hljs-attr">"x-vercel-ip-latitude"</span>: <span class="hljs-string">"xx.xxx"</span>,
        <span class="hljs-attr">"x-vercel-forwarded-for"</span>: <span class="hljs-string">"xx.xxx.xxx.xxx"</span>,
        <span class="hljs-attr">"x-vercel-id"</span>: <span class="hljs-string">"bom1::xv4kk-1678028068902-ec68b76f258e"</span>,
        <span class="hljs-attr">"forwarded"</span>: <span class="hljs-string">"for=xx.2xx.xx5.xxx;host=hob-api.vercel.app;proto=https;sig=0QmVhcmVyIDVhMDRiOGY1MmNkNjE0YTE4Zjc4Yjc3Y2Q1YWU5NmYzYWQ0MzVjMWNXXXXXXXXXXXXXjc0MTcwOGU=;exp=1678028368"</span>,
        <span class="hljs-attr">"postman-token"</span>: <span class="hljs-string">"xxxfxxxx-5e47-xxxa-a51a-xxxxxx43a60e"</span>,
        <span class="hljs-attr">"x-vercel-ip-longitude"</span>: <span class="hljs-string">"x8.xxx2"</span>,
        <span class="hljs-attr">"accept"</span>: <span class="hljs-string">"*/*"</span>,
        <span class="hljs-attr">"x-forwarded-for"</span>: <span class="hljs-string">"xx.xxx.xxx.xxx"</span>,
        <span class="hljs-attr">"x-forwarded-host"</span>: <span class="hljs-string">"hob-api.vercel.app"</span>,
        <span class="hljs-attr">"x-vercel-ip-country"</span>: <span class="hljs-string">"IN"</span>,
        <span class="hljs-attr">"x-forwarded-proto"</span>: <span class="hljs-string">"https"</span>,
        <span class="hljs-attr">"x-vercel-proxy-signature"</span>: <span class="hljs-string">"Bearer 5a04b8f52xxxxxxxxxxxxxxxxxxxx6f3ad435cxxxxxxxxxdbecd0d01eaff741708e"</span>,
        <span class="hljs-attr">"accept-encoding"</span>: <span class="hljs-string">"gzip, deflate, br"</span>,
        <span class="hljs-attr">"user-agent"</span>: <span class="hljs-string">"PostmanRuntime/x.xx.4"</span>,
        <span class="hljs-attr">"x-vercel-ip-country-region"</span>: <span class="hljs-string">"WB"</span>,
        <span class="hljs-attr">"x-vercel-ip-city"</span>: <span class="hljs-string">"Kolkata"</span>,
        <span class="hljs-attr">"x-vercel-ip-timezone"</span>: <span class="hljs-string">"Asia/Kolkata"</span>,
        <span class="hljs-attr">"x-vercel-proxied-for"</span>: <span class="hljs-string">"xx.xxx.xxx.1x4"</span>,
        <span class="hljs-attr">"connection"</span>: <span class="hljs-string">"close"</span>
    }
}
</code></pre>
<p>Here's a breakdown of each part of the response:</p>
<ul>
<li><p><code>"error": false</code>: This indicates that there was no error in processing the request.</p>
</li>
<li><p><code>"message": "Get Request Received Successfully"</code>: This is a custom message returned by the server to indicate that the GET request was received successfully.</p>
</li>
<li><p><code>"query_parameters": {}</code>: This is an empty object indicating that there were no query parameters included in the GET request.</p>
</li>
<li><p><code>"bearer_token": "Bearer Token is absent"</code>: This indicates that no bearer token was included in the request.</p>
</li>
<li><p><code>"headers": { ... }</code>: This is an object containing information about the headers included in the GET request. Each key-value pair in this object represents a single header and its value.</p>
</li>
</ul>
<p>Some of the headers included in this response include:</p>
<ul>
<li><p><code>"host": "</code><a target="_blank" href="http://hob-api.vercel.app"><code>hob-api.vercel.app</code></a><code>"</code>: This is the hostname of the server that received the request.</p>
</li>
<li><p><code>"x-real-ip": "xx.xxx.xxx.xxx"</code>: This is the IP address of the client that made the request.</p>
</li>
<li><p><code>"x-vercel-deployment-url": "</code><a target="_blank" href="http://hob-o362hip79-sbk2k1.vercel.app"><code>hob-o362hip79-sbk2k1.vercel.app</code></a><code>"</code>: This is the URL of the Vercel deployment that is handling the request.</p>
</li>
<li><p><code>"x-forwarded-proto": "https"</code>: This indicates that the request was made over HTTPS.</p>
</li>
<li><p><code>"user-agent": "PostmanRuntime/7.28.4"</code>: This is the user agent string of the client that made the request.</p>
</li>
<li><p><code>"connection": "close"</code>: This indicates that the connection will be closed after the response is sent.</p>
</li>
</ul>
<h3 id="heading-setting-path-and-query-parameters-and-bearer-token">Setting path and query parameters and bearer token</h3>
<ol>
<li>Let us configure the URL as such: (You can also add query params from the params tab)</li>
</ol>
<pre><code class="lang-bash">https://hob-api.vercel.app/get/2?id=2&amp;isDarkMode=<span class="hljs-literal">true</span>
</code></pre>
<ol start="2">
<li>Go to the "Headers" tab below the URL and add a new field called <code>Authorization</code> with any value. I'm setting "This is my Authorization Token".</li>
</ol>
<p>This is what the request looks like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678028471343/cbfb6613-6c30-46d6-9b5e-6c26a27dd1ba.png" alt class="image--center mx-auto" /></p>
<p>Firing it off returns a response like this:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"error"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Get Request Received Successfully"</span>,
    <span class="hljs-attr">"path_parameter"</span>: <span class="hljs-string">"2 is the path parameter"</span>,
    <span class="hljs-attr">"query_parameters"</span>: {
        <span class="hljs-attr">"id"</span>: <span class="hljs-string">"2"</span>,
        <span class="hljs-attr">"isDarkMode"</span>: <span class="hljs-string">"true"</span>
    },
    <span class="hljs-attr">"bearer_token"</span>: <span class="hljs-string">"This is my Authorization Token"</span>,
    <span class="hljs-attr">"headers"</span>: {
        <span class="hljs-attr">"host"</span>: <span class="hljs-string">"hob-api.vercel.app"</span>,
        <span class="hljs-attr">"authorization"</span>: <span class="hljs-string">"This is my Authorization Token"</span>,
        <span class="hljs-attr">"x-real-ip"</span>: <span class="hljs-string">"45.250.245.184"</span>,
        <span class="hljs-attr">"x-vercel-proxy-signature-ts"</span>: <span class="hljs-string">"1678028816"</span>,
        <span class="hljs-attr">"x-vercel-deployment-url"</span>: <span class="hljs-string">"hob-o362hip79-sbk2k1.vercel.app"</span>,
        <span class="hljs-attr">"x-vercel-ip-latitude"</span>: <span class="hljs-string">"22.518"</span>,
        <span class="hljs-attr">"x-vercel-forwarded-for"</span>: <span class="hljs-string">"45.250.245.184"</span>,
        <span class="hljs-attr">"x-vercel-id"</span>: <span class="hljs-string">"bom1::xq858-1678028516261-e4cf9d6423ef"</span>,
        <span class="hljs-attr">"forwarded"</span>: <span class="hljs-string">"for=45.250.245.184;host=hob-api.vercel.app;proto=https;sig=0QmVhcmVyIDdmN2Y1MmI3NGNhMDRjNDUxYmQ1MDA5YzljOWNiNWMwMDIzMjJiZmZlYjc0MDZkMGE2MThiYjNkYmViNzQyY2M=;exp=1678028816"</span>,
        <span class="hljs-attr">"postman-token"</span>: <span class="hljs-string">"2c80ebc4-74cd-44c3-af96-72f27b0c67c5"</span>,
        <span class="hljs-attr">"x-vercel-ip-longitude"</span>: <span class="hljs-string">"88.3832"</span>,
        <span class="hljs-attr">"accept"</span>: <span class="hljs-string">"*/*"</span>,
        <span class="hljs-attr">"x-forwarded-for"</span>: <span class="hljs-string">"45.250.245.184"</span>,
        <span class="hljs-attr">"x-forwarded-host"</span>: <span class="hljs-string">"hob-api.vercel.app"</span>,
        <span class="hljs-attr">"x-vercel-ip-country"</span>: <span class="hljs-string">"IN"</span>,
        <span class="hljs-attr">"x-vercel-ip-country-region"</span>: <span class="hljs-string">"WB"</span>,
        <span class="hljs-attr">"x-vercel-proxy-signature"</span>: <span class="hljs-string">"Bearer 7f7f52b74ca04c451bd5009c9c9cb5c002322bffeb7406d0a618bb3dbeb742cc"</span>,
        <span class="hljs-attr">"accept-encoding"</span>: <span class="hljs-string">"gzip, deflate, br"</span>,
        <span class="hljs-attr">"user-agent"</span>: <span class="hljs-string">"PostmanRuntime/7.28.4"</span>,
        <span class="hljs-attr">"x-forwarded-proto"</span>: <span class="hljs-string">"https"</span>,
        <span class="hljs-attr">"x-vercel-ip-city"</span>: <span class="hljs-string">"Kolkata"</span>,
        <span class="hljs-attr">"x-vercel-ip-timezone"</span>: <span class="hljs-string">"Asia/Kolkata"</span>,
        <span class="hljs-attr">"x-vercel-proxied-for"</span>: <span class="hljs-string">"45.250.245.184"</span>,
        <span class="hljs-attr">"connection"</span>: <span class="hljs-string">"close"</span>
    }
}
</code></pre>
<p>Describing the fields and what they are used for:</p>
<ul>
<li><p><strong>Query parameters</strong>: These are extra data passed in the URL of an HTTP request to filter or modify the response. They are commonly used to paginate, sort, or filter data, and are visible in the URL bar of the browser. They are appended to the endpoint after a "?" as key=value pairs separated by "&amp;".</p>
</li>
<li><p><strong>Path parameters</strong>: These are parts of the URL that represent a dynamic value. They are often used to identify a specific resource and make URLs more meaningful and memorable. A path parameter is the value that follows the endpoint route after a "/", such as the <code>2</code> in <code>/get/2</code>.</p>
</li>
<li><p><strong>Authentication header</strong>: This is the <code>Authorization</code> request header, which carries a token or credentials that prove the identity of the requester. It is commonly used to secure APIs and web applications so that only authorized users can access protected resources. The token is often signed or encoded so the server can verify which kind of client is allowed to reach a given route.</p>
</li>
</ul>
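<p>The same request can be put together in code. This is a minimal sketch using the built-in <code>URL</code> class; <code>buildGetRequest</code> is a hypothetical helper and not part of the API, and the base URL is assumed to be the deployed endpoint used in this post.</p>

```javascript
// Assumed base URL: the deployed endpoint used in this post.
const BASE_URL = "https://hob-api.vercel.app";

// Builds the URL and fetch options for a GET request.
// `buildGetRequest` is a hypothetical helper, not part of the API itself.
function buildGetRequest(pathParam, query, token) {
  // Path parameter: appended to the route after a "/".
  const url = new URL(`/get/${encodeURIComponent(pathParam)}`, BASE_URL);

  // Query parameters: key=value pairs after "?", joined with "&".
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, String(value));
  }

  // Authorization header: sent alongside the request, not in the URL.
  const options = { method: "GET", headers: { Authorization: token } };
  return { url, options };
}

const { url, options } = buildGetRequest(
  2,
  { id: 2, isDarkMode: true },
  "This is my Authorization Token"
);

console.log(url.href); // https://hob-api.vercel.app/get/2?id=2&isDarkMode=true
console.log(url.pathname.split("/").pop()); // 2
console.log(Object.fromEntries(url.searchParams)); // { id: '2', isDarkMode: 'true' }

// To actually send it (Node 18+ ships a global fetch):
//   fetch(url, options).then((res) => res.json()).then(console.log);
```

<p>Note that every query parameter arrives as a string, which matches the <code>"id": "2"</code> and <code>"isDarkMode": "true"</code> values in the response above.</p>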
<p><strong>NOTE: We will not be explaining path and query parameters or the authentication header/other headers anymore moving forward.</strong></p>
<hr />
<h2 id="heading-post-routes">/post Routes</h2>
<p>HTTP POST is a request method that is used to submit an entity to the specified resource, often causing a change in state or side effects on the server. The main purpose of the HTTP POST method is to create a new resource or to update an existing resource on the server.</p>
<p>The constraints for HTTP POST requests are:</p>
<ul>
<li><p>The request payload contains the entity that will be created or processed on the server.</p>
</li>
<li><p>The POST method may cause side effects, such as the creation of a new resource or the modification of an existing one. It is not idempotent: sending the same request twice may create two resources.</p>
</li>
<li><p>Unlike GET responses, POST responses are not cached by default.</p>
</li>
<li><p>POST requests change state on the server, so they usually require proper authentication and authorization.</p>
</li>
</ul>
<p>Create a new request in Postman or your browser as shown below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678087885468/d4d1d221-18ce-471c-af84-ba33146ff55d.png" alt class="image--center mx-auto" /></p>
<p>Go to the Body tab, select "raw" and then "JSON" from the drop-down. Enter any random data or paste the following body:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"field_1"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"sbk2k1"</span>,
    <span class="hljs-attr">"github"</span>: <span class="hljs-string">"github.com/sbk2k1"</span>
}
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678089879423/9417f18c-79d5-459b-8b2f-34c3252359a0.png" alt class="image--center mx-auto" /></p>
<p>Fire it off! The response looks like this if everything was 200 OK.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"error"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Post Request Received Successfully"</span>,
    <span class="hljs-attr">"body"</span>: {
        <span class="hljs-attr">"field_1"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"sbk2k1"</span>,
        <span class="hljs-attr">"github"</span>: <span class="hljs-string">"github.com/sbk2k1"</span>
    },
    <span class="hljs-attr">"query_parameters"</span>: {},
    <span class="hljs-attr">"bearer_token"</span>: <span class="hljs-string">"Bearer Token is absent"</span>,
    <span class="hljs-attr">"headers"</span>: {
        ...
    }
}
</code></pre>
<p>In a POST request, the body contains the data being sent to the server. This data can be in various formats such as JSON, XML, or plain text. The URL routes the request and the headers carry metadata about it, while the body carries the actual content to process: a resource to create or update, a submitted form, or an uploaded file. The response above echoes back the body you sent. In an actual working backend, the body would typically be validated and saved to a database.</p>
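<p>Here is a minimal sketch of how the same JSON body travels in code. <code>buildJsonPost</code> is a hypothetical helper, and the commented-out call assumes the <code>/post</code> route of the deployed endpoint.</p>

```javascript
// The same payload used in the Postman example above.
const payload = { field_1: 1, name: "sbk2k1", github: "github.com/sbk2k1" };

// Hypothetical helper: builds fetch options for a JSON POST request.
function buildJsonPost(body) {
  return {
    method: "POST",
    // Content-Type tells the server how to parse the body.
    headers: { "Content-Type": "application/json" },
    // The body travels as text, so the object is serialized first.
    body: JSON.stringify(body),
  };
}

const request = buildJsonPost(payload);
console.log(request.body);
// {"field_1":1,"name":"sbk2k1","github":"github.com/sbk2k1"}

// What a server does on the other side: parse the text back into an object.
const received = JSON.parse(request.body);
console.log(received.name); // sbk2k1

// To send it for real (Node 18+ ships a global fetch):
//   fetch("https://hob-api.vercel.app/post", request)
//     .then((res) => res.json())
//     .then(console.log);
```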
<hr />
<h2 id="heading-put-routes">/put Routes</h2>
<p>A PUT request is an HTTP method that replaces the resource at a given URL with the representation sent in the request. If nothing exists at that URL yet, the server may create the resource there. In practice, PUT is mostly used to update an existing resource, whereas POST is used to create a new one.</p>
<p>The constraints of a PUT request are similar to those of a POST request: the client needs proper authorization to modify the resource. Unlike POST, PUT is idempotent, meaning that making multiple identical requests has the same effect as a single request.</p>
<p>The body of a PUT request contains the updated representation of the resource being modified. This means that the client must send the entire representation of the resource, including any fields that have not changed.</p>
<p>One of the primary constraints of a PUT request is that it replaces the entire resource at the given URL. If the client only wants to update a specific field or subset of fields, a PATCH request should be used instead. Additionally, if the client does not know the URL of the resource in advance, because the server assigns it on creation, a POST request should be used instead of a PUT request.</p>
<p><strong>NOTE: The PATCH method is not inherently dangerous, but it can be risky if not used carefully. PATCH requests make partial updates to an existing resource rather than replacing it entirely, so a badly crafted request can overwrite important data or leave the resource in an inconsistent state. Unlike PUT, PATCH is not guaranteed to be idempotent. Since PATCH is also a less commonly used HTTP method, some systems may not support it or may handle it differently, which could lead to compatibility issues. This blog does not include PATCH functionality.</strong></p>
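<p>The difference between replacing and partially updating is easier to see in code. This is a minimal sketch that assumes an in-memory <code>Map</code> standing in for a database; the <code>put</code> and <code>patch</code> functions and the stored fields are made up for illustration and are not how the demo API is implemented.</p>

```javascript
// In-memory stand-in for a database; ids and fields are illustrative only.
const store = new Map([
  ["1", { name: "sbk2k1", github: "github.com/sbk2k1" }],
]);

// PUT: the stored resource becomes exactly what the client sent.
function put(id, body) {
  store.set(id, { ...body });
  return store.get(id);
}

// PATCH: only the supplied fields change; everything else is kept.
function patch(id, partial) {
  const updated = { ...store.get(id), ...partial };
  store.set(id, updated);
  return updated;
}

console.log(patch("1", { name: "saptarshi" }));
// { name: 'saptarshi', github: 'github.com/sbk2k1' }

console.log(put("1", { name: "saptarshi" }));
// { name: 'saptarshi' }  (github is gone: PUT replaced the whole resource)

// Idempotent: repeating the same PUT leaves the same state.
put("1", { name: "saptarshi" });
console.log(store.get("1")); // { name: 'saptarshi' }
```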
<p>Change the request in Postman to a <strong>/put</strong> route and fire it off. It should look like this. We are using the same data as the post request.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678090631989/285c458f-dfa9-4257-bb00-b916e3d5ca0a.png" alt class="image--center mx-auto" /></p>
<p>The 200 OK response should look something like this.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"error"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Put Request Received Successfully"</span>,
    <span class="hljs-attr">"body"</span>: {
        <span class="hljs-attr">"field_1"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"sbk2k1"</span>,
        <span class="hljs-attr">"github"</span>: <span class="hljs-string">"github.com/sbk2k1"</span>
    },
    <span class="hljs-attr">"query_parameters"</span>: {},
    <span class="hljs-attr">"bearer_token"</span>: <span class="hljs-string">"Bearer Token is absent"</span>,
    <span class="hljs-attr">"headers"</span>: {
        ...
    }
}
</code></pre>
<hr />
<h2 id="heading-delete-routes">/delete Routes</h2>
<p>The DELETE HTTP method is used to remove the resource identified by a URI (Uniform Resource Identifier) from the server. The DELETE method is idempotent, which means that making the same request multiple times leaves the server in the same state as making the request only once.</p>
<p>In RESTful API design, the DELETE method typically does not have a body because the resource to be deleted is identified by the URI. The server deletes the resource identified by the URI, and the response status code indicates the success or failure of the operation.</p>
<p>To specify the resource to be deleted, the client includes the resource's identifier in the URI. For example, a DELETE request to <a target="_blank" href="https://example.com/api/users/123"><code>https://example.com/api/users/123</code></a> would delete the user with the ID of 123. If the resource to be deleted cannot be found, the server returns a 404 Not Found status code. (Status code refresher <a target="_blank" href="https://highonbugs.sbk2k1.me/rest-part-1">here</a>)</p>
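<p>As a minimal sketch of that behaviour, assume an in-memory <code>Map</code> standing in for the users table. The <code>deleteUser</code> handler and its choice of status codes are hypothetical, not taken from the demo API.</p>

```javascript
// In-memory stand-in for a users table; the id is illustrative only.
const users = new Map([["123", { name: "sbk2k1" }]]);

// Hypothetical handler: returns the HTTP status a server might send.
function deleteUser(id) {
  if (!users.has(id)) {
    return 404; // nothing to delete at this URI
  }
  users.delete(id);
  return 200;
}

console.log(deleteUser("123")); // 200
console.log(deleteUser("123")); // 404
console.log(users.size); // 0
```

<p>The second call returns a different status code, but the state on the server is the same after one call or many. That is what idempotent means for DELETE.</p>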
<p>Load up the request as shown below.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678092386726/ac89195f-02ea-4d84-afe9-4b0d4af94ee1.png" alt class="image--center mx-auto" /></p>
<p>The 200 OK response is as follows.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"error"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"message"</span>: <span class="hljs-string">"Delete Request Received Successfully"</span>,
    <span class="hljs-attr">"query_parameters"</span>: {},
    <span class="hljs-attr">"bearer_token"</span>: <span class="hljs-string">"Bearer Token is absent"</span>,
    <span class="hljs-attr">"headers"</span>: {
        ...
    }
}
</code></pre>
<p><strong>NOTE: You can add a bearer token, query, or path parameters to any of the requests. I've simply skipped these for the brevity of this blog.</strong></p>
<p>In conclusion, the four HTTP methods, GET, POST, PUT, and DELETE, serve different purposes in REST API design. GET is used to retrieve resources, POST to create new resources, PUT to update or replace existing resources, and DELETE to remove resources. Each method has its constraints and use cases that are important to consider when designing a RESTful API. Understanding the differences between these methods is crucial for creating effective and efficient APIs.</p>
<hr />
<p>You may have understood the different technicalities, but this blog still does not highlight how these APIs are used in an actual application. Stay tuned for another series in which we will go through building an entire app and will show all implementations, code structure, and security/authentication measures commonly used. This should have provided you with a clear view of what REST APIs and REST are. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[REST and REST APIs - Part 2]]></title><description><![CDATA[In this blog, we will set up a REST API implemented using HTTP to play around and understand.
Resources

API endpoint: https://hob-api.vercel.app/

GitHub Repository: https://github.com/sbk2k1/API-Blog


Prerequisites
We will need some tools to get t...]]></description><link>https://highonbugs.sbk2k1.in/rest-part-2</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/rest-part-2</guid><category><![CDATA[REST API]]></category><category><![CDATA[REST]]></category><category><![CDATA[restful]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[Git]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sun, 05 Mar 2023 15:09:23 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1678008840695/f0add78b-1cec-4516-bc72-5758fb7cac59.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this blog, we will set up a REST API implemented using HTTP to play around and understand.</p>
<h3 id="heading-resources">Resources</h3>
<ol>
<li><p><strong>API endpoint:</strong> <a target="_blank" href="https://hob-api.vercel.app/">https://hob-api.vercel.app/</a></p>
</li>
<li><p><strong>GitHub Repository</strong>: <a target="_blank" href="https://github.com/sbk2k1/API-Blog">https://github.com/sbk2k1/API-Blog</a></p>
</li>
</ol>
<h3 id="heading-prerequisites">Prerequisites</h3>
<p>We will need some tools to get through this blog. They are:</p>
<ol>
<li><p><a target="_blank" href="https://highonbugs.sbk2k1.me/git-and-github">Git/GitHub</a> and <a target="_blank" href="https://nodejs.org/en/download/">npm</a>/<a target="_blank" href="https://classic.yarnpkg.com/lang/en/docs/install/#windows-stable">yarn</a> (necessary only if the deployed endpoint is down)</p>
</li>
<li><p><a target="_blank" href="https://www.postman.com/">Postman</a> (We can use cURL but Postman will provide us with a cleaner UI)</p>
</li>
</ol>
<h3 id="heading-installation">Installation</h3>
<ul>
<li><p><strong>Git/GitHub</strong> (optional)- Check out my <a target="_blank" href="https://highonbugs.sbk2k1.me/git-and-github">blog</a>!</p>
</li>
<li><p><strong>npm</strong> and <strong>yarn</strong> (optional) - To install npm, follow the steps below:</p>
<ol>
<li><p>Download the Node.js installer from <a target="_blank" href="https://nodejs.org/en/download/"><strong>here</strong></a><strong>.</strong></p>
</li>
<li><p>Run the installer and follow the installation steps.</p>
</li>
<li><p>Once installed, open a command prompt or terminal and type <code>npm -v</code> to check the version of npm.</p>
</li>
</ol>
</li>
</ul>
<p>To install yarn, follow the steps below:</p>
<ol>
<li><p>Download the Yarn installer from <a target="_blank" href="https://classic.yarnpkg.com/en/docs/install"><strong>here</strong></a><strong>.</strong></p>
</li>
<li><p>Run the installer and follow the installation steps.</p>
</li>
<li><p>Once installed, open a command prompt or terminal and type <code>yarn -v</code> to check the version of Yarn.</p>
</li>
</ol>
<p>    <strong>Note: Yarn can also be installed using npm by running the command</strong> <code>npm install -g yarn</code> <strong>in the command prompt or terminal.</strong></p>
<ul>
<li><p><strong>Postman:</strong> To install Postman, you can follow these steps:</p>
<ol>
<li><p>Go to the Postman <a target="_blank" href="https://www.postman.com/downloads/">website</a>.</p>
</li>
<li><p>Click the "Download" button for the version of Postman you want to install.</p>
</li>
<li><p>Follow the on-screen instructions to download the installation file.</p>
</li>
<li><p>Once the file is downloaded, open it and follow the instructions to install Postman.</p>
</li>
<li><p>Once the installation is complete, you can launch Postman by opening the application from your computer's application menu or by double-clicking on the Postman icon on your desktop (if you chose to create one during installation).</p>
</li>
</ol>
</li>
</ul>
<p>    That's it! You're now ready to use Postman to test and explore APIs.</p>
<h3 id="heading-setting-up-a-local-server-optional">Setting up a local server (optional)</h3>
<p>Follow these steps to start a server on your local system:</p>
<ol>
<li><p>Clone the repository:</p>
<pre><code class="lang-bash"> git <span class="hljs-built_in">clone</span> https://github.com/sbk2k1/API-Blog.git
</code></pre>
</li>
<li><p>Navigate to the directory:</p>
<pre><code class="lang-bash"> <span class="hljs-built_in">cd</span> API-Blog
</code></pre>
</li>
<li><p>Install dependencies using either npm or yarn:</p>
<pre><code class="lang-bash"> npm install
</code></pre>
<p> or</p>
<pre><code class="lang-bash"> yarn install
</code></pre>
</li>
<li><p>Remove the two forward slashes (<code>//</code>) from the beginning of the <code>app.listen</code> line to uncomment it.</p>
<pre><code class="lang-javascript"> app.listen(<span class="hljs-number">3000</span>, <span class="hljs-function">() =&gt;</span> {
   <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Server running on port 3000`</span>)
 });
</code></pre>
</li>
<li><p>Add two forward slashes (<code>//</code>) at the beginning of the <code>module.exports = app</code> line to comment it out.</p>
<pre><code class="lang-javascript"> <span class="hljs-comment">// module.exports = app;</span>
</code></pre>
</li>
<li><p>Once the dependencies are installed, start the server:</p>
<pre><code class="lang-bash"> npm start
</code></pre>
<p> or</p>
<pre><code class="lang-bash"> yarn start
</code></pre>
<p> This will start the server on port 3000 by default.</p>
</li>
</ol>
<h3 id="heading-getting-it-to-work">Getting it to work</h3>
<p><strong>Note: I'm going to use the deployed endpoint for the blog, but you can use</strong> <code>http://localhost:3000</code> <strong>if you have the server set up on your local system.</strong></p>
<h2 id="heading-route">/ Route</h2>
<p>Create a new request in Postman as shown below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678027771128/b461c08f-ed22-492a-ab99-71d18f1b534c.png" alt class="image--center mx-auto" /></p>
<p>Fire it off and see what happens.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1678027804334/54c88d25-512a-4325-8523-841f823d3474.png" alt class="image--center mx-auto" /></p>
<p>This is the response.</p>
<p>This signifies that the server is working fine. Note the status in the top right that says 200 OK. (Status code refresher <a target="_blank" href="https://highonbugs.sbk2k1.me/rest-part-1">here</a>)</p>
<p>You can also paste the link into your browser and see what happens. By default, when you enter a URL into a web browser and press enter, it sends a GET request to the server to retrieve the content of the specified resource.</p>
<p>We will look into each route and each little element in greater detail in the next blog. Stay tuned for part 3. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[REST and REST APIs - Part 1]]></title><description><![CDATA[This blog will talk about REST and RESTful APIs and will lead its way to the Backend Dev Blog.
So what is REST?
REST is an acronym for REpresentational State Transfer. It is a software architectural style or design pattern. An architectural pattern i...]]></description><link>https://highonbugs.sbk2k1.in/rest-part-1</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/rest-part-1</guid><category><![CDATA[REST API]]></category><category><![CDATA[REST]]></category><category><![CDATA[http]]></category><category><![CDATA[StateLESS]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Sat, 04 Mar 2023 17:36:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1677937542129/b5b22070-09d3-44e5-93a6-22290794415e.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This blog will talk about REST and RESTful APIs and will lead its way to the Backend Dev Blog.</p>
<h3 id="heading-so-what-is-rest">So what is REST?</h3>
<p><strong>REST</strong> is an acronym for <strong>RE</strong>presentational <strong>S</strong>tate <strong>T</strong>ransfer. It is a software architectural style or design pattern. An architectural pattern is a general, reusable solution to a commonly occurring problem in software architecture within a given context. Architectural patterns are often documented as software design patterns. <strong>REST</strong> was originally designed as a Web architecture, and its principles were presented by <strong>Roy Fielding</strong>, a computer scientist, in his Ph.D. dissertation in 2000. <strong>REST</strong>-compliant systems, often called <strong>REST</strong>ful systems, are characterized by being stateless and by separating the concerns of the client and server.</p>
<h3 id="heading-some-other-architectural-styles">Some other Architectural Styles.</h3>
<p>There are many recognized architectural styles and design patterns, among them:</p>
<ul>
<li><p>Blackboard</p>
</li>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/Client%E2%80%93server_model">Client-server</a> (2-tier, 3-tier, <em>n</em>-tier, cloud computing exhibit this style)</p>
</li>
<li><p>Component-based</p>
</li>
<li><p>Data-centric</p>
</li>
<li><p>Event-driven (or implicit invocation)</p>
</li>
<li><p>Layered (or multilayered architecture)</p>
</li>
<li><p>Microservices architecture</p>
</li>
<li><p>Monolithic application</p>
</li>
<li><p>Peer-to-peer (P2P)</p>
</li>
<li><p>Pipes and filters</p>
</li>
<li><p>Plug-ins</p>
</li>
<li><p>Reactive architecture</p>
</li>
<li><p>Representational state transfer (REST)</p>
</li>
<li><p>Rule-based</p>
</li>
<li><p>Service-oriented</p>
</li>
<li><p>Shared nothing architecture</p>
</li>
<li><p>Space-based architecture</p>
</li>
</ul>
<p><strong>Disclaimer: I don't know about all of these. These are just to show what REST is and what are some of its alternatives.</strong></p>
<h3 id="heading-principles-of-rest">Principles of REST</h3>
<ol>
<li><p><strong>Uniform Interface:</strong> The following four constraints can achieve a uniform REST interface:</p>
<ul>
<li><p><strong>Identification of resources</strong> – The interface must uniquely identify each resource involved in the interaction between the client and the server.</p>
</li>
<li><p><strong>Manipulation of resources through representations</strong> – The resources should have uniform representations in the server response.</p>
</li>
<li><p><strong>Self-descriptive messages</strong> – Each resource representation should carry enough information to describe how to process the message. It must clearly convey which operations can be performed on a resource.</p>
</li>
<li><p><strong>Hypermedia as the engine of application state</strong> – The client should have only the initial URI of the application. The client application should dynamically drive all other resources and interactions with the use of hyperlinks.</p>
</li>
</ul>
</li>
<li><p><strong>Client and Server</strong>: The client is the entity that makes the demands and the server is the entity that handles the demands. The producer and consumer need to be separate for independent evolution. In the REST architectural style, the implementation of the client and the implementation of the server can be done independently without each knowing about the other. As long as each side knows what format of messages to send to the other, they can be kept modular and separate. Separating the user interface concerns from the data storage concerns, we improve the flexibility of the interface across platforms and improve scalability by simplifying the server components.</p>
</li>
<li><p><strong>Statelessness:</strong> Systems that follow the REST paradigm are stateless, meaning that the server does not need to know anything about what state the client is in and vice versa. In this way, both the server and the client can understand any message received, even without seeing previous messages. This constraint of statelessness is enforced through the use of <em>resources</em>, rather than <em>commands</em>. Resources are the nouns of the Web - they describe any object, document, or <em>thing</em> that you may need to store or send to other services. This also means all the information needed to carry out a request is present in the request itself.</p>
</li>
<li><p><strong>Cacheable:</strong> The cacheable constraint requires that a response should implicitly or explicitly label itself as cacheable or non-cacheable. If the response is cacheable, the client application gets the right to reuse the response data later for equivalent requests and a specified period.</p>
</li>
<li><p><strong>Layered System:</strong> The layered system style allows an architecture to be composed of hierarchical layers by constraining component behavior. For example, in a layered system, each component cannot see beyond the immediate layer they are interacting with.</p>
</li>
<li><p><strong>Code on Demand:</strong> This constraint is optional — an API can be RESTful even without providing code on demand. The client can request code from the server, and then the response from the server will contain some code, usually in the form of a script, when the response is in HTML format. The client then can execute that code.</p>
</li>
</ol>
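<p>To make the statelessness principle a bit more concrete, here is a minimal sketch. The <code>articles</code> array and the <code>listArticles</code> function are made up for illustration and are not part of any real API. The point is that the request itself says which page it wants, so the server never has to remember what the client asked for earlier.</p>

```javascript
// Hypothetical data set standing in for a real resource collection.
const articles = ["intro", "rest-part-1", "rest-part-2", "git-and-github", "attention"];

// Stateless: the request carries the page and size it wants, so the
// server does not need to remember where this client stopped last time.
function listArticles(request) {
  const page = Number(request.query.page);
  const size = Number(request.query.size);
  const start = (page - 1) * size;
  return { status: 200, body: articles.slice(start, start + size) };
}

const pageOne = listArticles({ query: { page: "1", size: "2" } });
const pageTwo = listArticles({ query: { page: "2", size: "2" } });
// Repeating a request gives the same answer, whatever was asked before it.
const pageOneAgain = listArticles({ query: { page: "1", size: "2" } });

console.log(pageOne.body, pageTwo.body, pageOneAgain.body);
```

<p>A stateful design would instead keep a "current page" per client on the server, which is exactly the kind of hidden context REST asks us to avoid.</p>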
<h3 id="heading-what-is-a-resource">What is a Resource?</h3>
<p>In <strong>REST</strong>, a resource is an object or piece of data that can be identified and manipulated using a unique identifier or <strong>URL</strong>.</p>
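<p>As a rough illustration of how a URL identifies a resource, the sketch below takes one apart using the standard <code>URL</code> class available in Node.js and browsers. The address itself is invented for this example.</p>

```javascript
// The URL below is made up for illustration.
const resourceUrl = new URL("https://api.example.com/users/42/posts?limit=5&sort=newest");

// The origin tells us which server to talk to.
console.log(resourceUrl.origin);

// The path identifies the resource: the posts that belong to user 42.
const segments = resourceUrl.pathname.split("/").filter(Boolean);
console.log(segments);

// Query parameters refine the request without changing which resource it is.
console.log(resourceUrl.searchParams.get("limit"));
```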
<h3 id="heading-what-are-resource-methods">What are Resource Methods?</h3>
<p>Resource methods are used to perform the desired transition between two states of any resource.</p>
<p><strong>NOTE: A large number of people wrongly equate resource methods with HTTP methods (i.e., GET/PUT/POST/DELETE). Roy Fielding never recommended which method should be used in which situation. All he emphasizes is that the interface should be uniform.</strong></p>
<h3 id="heading-rest-and-http-are-not-the-same">REST and HTTP are Not the Same</h3>
<p>Many people treat HTTP and REST as interchangeable. <strong>REST and HTTP are not the same.</strong> HTTP is the underlying protocol used for communication between clients and servers on the World Wide Web. REST, on the other hand, is an architectural style that provides a set of guidelines and constraints for designing web services that are scalable, reliable, and easy to maintain. In his dissertation, Roy Fielding defines REST as a set of constraints rather than as a specific implementation. He does discuss HTTP as a protocol the style was applied to, but REST itself is not tied to HTTP or to any other protocol we use today.</p>
<h3 id="heading-rest-apis">REST APIs</h3>
<p>This section is a lead-in to the Backend Development blog and covers what you need to know to get started with APIs.</p>
<h3 id="heading-what-is-an-api">What is an API?</h3>
<p>An API (Application Programming Interface) is a set of protocols, tools, and standards for building software applications. It defines how different software components should interact with each other, allowing developers to create software applications that can communicate and exchange data with other systems. APIs can be used to access data, services, or functionality provided by other software applications, and they are essential for building modern web and mobile applications. You can think of a web API as a gateway between clients and resources on the web.</p>
<p><strong>For example:</strong> Let's consider a restaurant. The customers who come in and place orders are the consumers; in this case, they are the client. The producers are the chef and the kitchen crew, known here as the server (backend). The waiter connects these two and ensures the smooth running of the restaurant (the software/app). Thus, the waiter here is the API.</p>
<h3 id="heading-what-are-restful-apis">What are RESTful APIs?</h3>
<p>A RESTful API is a type of web API that follows the principles of the REST architectural style. It is designed to provide a standard way of accessing and manipulating resource states over the web using various methods.</p>
<h3 id="heading-why-restful-apis">Why RESTful APIs</h3>
<p>The benefits of using RESTful APIs include:</p>
<ol>
<li><p><strong>Scalability</strong>: RESTful APIs are designed to be scalable and can handle large volumes of requests and responses.</p>
</li>
<li><p><strong>Flexibility</strong>: RESTful APIs can support different types of clients (e.g., web browsers, mobile apps, IoT devices) and can be used with different programming languages and frameworks.</p>
</li>
<li><p><strong>Modularity</strong>: RESTful APIs are typically organized around resources, making them modular and easier to maintain.</p>
</li>
<li><p><strong>Caching</strong>: RESTful APIs support caching of responses, which can improve performance and reduce server load.</p>
</li>
<li><p><strong>Security</strong>: RESTful APIs can be secured using standard security protocols such as HTTPS and OAuth, making them more secure and less vulnerable to attacks.</p>
</li>
<li><p><strong>Ease of Use</strong>: RESTful APIs are easy to use and understand, which makes them more accessible to developers of all skill levels.</p>
</li>
</ol>
<p>Overall, RESTful APIs provide a standardized and scalable way to build modern web and mobile applications that can communicate and exchange data with other systems.</p>
<h3 id="heading-the-way-of-rest">The Way of REST</h3>
<p>The basic function of a RESTful API is the same as browsing the internet. The client contacts the server by using the API when it requires a resource. API developers explain how the client should use the REST API in the server application API documentation. These are the general steps for any REST API call:</p>
<p><img src="https://miro.medium.com/v2/resize:fit:875/0*HOi484uBZsxPQEKA.jpeg" alt /></p>
<ol>
<li><p>The client sends a request to the server. The client follows the API documentation to format the request in a way that the server understands.</p>
</li>
<li><p>The server authenticates the client and confirms that the client has the right to make that request.</p>
</li>
<li><p>The server receives the request and processes it internally.</p>
</li>
<li><p>The server returns a response to the client. The response contains information that tells the client whether the request was successful. The response also includes any information that the client requested.</p>
</li>
</ol>
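<p>The four steps above can be sketched as a single function. This is only a toy model under stated assumptions: the <code>x-api-key</code> header, the <code>apiKeys</code> set and the <code>menu</code> data are hypothetical, and a real server would sit behind an HTTP framework rather than be called directly.</p>

```javascript
// Hypothetical API keys and data, used only for this sketch.
const apiKeys = new Set(["demo-key"]);
const menu = new Map([["/dishes/1", { name: "dosa" }]]);

function handleCall(request) {
  // Step 2: authenticate the client before doing any work.
  if (!apiKeys.has(request.headers["x-api-key"])) {
    return { status: 401, body: { error: "unauthorized" } };
  }
  // Step 3: process the request internally.
  const dish = menu.get(request.path);
  // Step 4: report success or failure, plus the requested data.
  if (dish === undefined) {
    return { status: 404, body: { error: "not found" } };
  }
  return { status: 200, body: dish };
}

// Step 1: the client formats the request the way the documentation describes.
const response = handleCall({
  method: "GET",
  path: "/dishes/1",
  headers: { "x-api-key": "demo-key" },
});

console.log(response.status, response.body);
```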
<h3 id="heading-client-request">Client Request</h3>
<p>What does the RESTful API client request contain? The components are:</p>
<ol>
<li><p><strong>Unique resource identifier:</strong> The server identifies each resource with unique resource identifiers. For REST services, the server typically performs resource identification by using a Uniform Resource Locator (URL). The URL specifies the path to the resource. It's called an endpoint.</p>
</li>
<li><p><strong>Methods:</strong> Developers often implement RESTful APIs by using the Hypertext Transfer Protocol (HTTP). An HTTP method tells the server what it needs to do to the resource. The following are four common HTTP methods:</p>
<ol>
<li><p><strong><em>GET:</em></strong> Clients use GET to access resources that are located at the specified URL on the server. They can cache GET requests and send parameters in the RESTful API request to instruct the server to filter data before sending.</p>
</li>
<li><p><strong><em>POST:</em></strong> Clients use POST to send data to the server. They include the data representation with the request. Sending the same POST request multiple times has the side effect of creating the same resource multiple times.</p>
</li>
<li><p><strong><em>PUT:</em></strong> Clients use PUT to update existing resources on the server. Unlike POST, sending the same PUT request multiple times in a RESTful web service gives the same result.</p>
</li>
<li><p><strong><em>DELETE:</em></strong> Clients use the DELETE request to remove the resource. A DELETE request can change the server state. However, if the user does not have appropriate authentication, the request fails.</p>
</li>
</ol>
</li>
<li><p><strong>HTTP headers:</strong> Request headers are the metadata exchanged between the client and server. For instance, the request header indicates the format of the request and response, provides information about the request status, and so on. It also has fields for authentication.</p>
</li>
<li><p><strong>Parameters:</strong> RESTful API requests can include parameters that give the server more details about what needs to be done. The following are some different types of parameters:</p>
<ul>
<li><p>Path parameters that specify URL details.</p>
</li>
<li><p>Query parameters that request more information about the resource.</p>
</li>
<li><p>Cookie parameters that authenticate clients quickly.</p>
</li>
</ul>
</li>
</ol>
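<p>The difference between POST and PUT described above is easier to see with a small experiment. The sketch below uses a hypothetical in-memory <code>store</code> in place of a real server, and the <code>post</code> and <code>put</code> functions are stand-ins I made up, not a real library API.</p>

```javascript
// Hypothetical in-memory store; the ids and fields are made up.
const store = new Map();
let nextId = 1;

// POST: the server picks a new id on every call, so repeats create duplicates.
function post(data) {
  const id = nextId;
  nextId += 1;
  store.set(id, data);
  return { status: 201, id };
}

// PUT: the client names the target, so repeats overwrite the same entry.
function put(id, data) {
  store.set(id, data);
  return { status: 200, id };
}

const created = post({ title: "hello" });
post({ title: "hello" });
const afterPosts = store.size; // the same POST sent twice made two resources

put(created.id, { title: "hello, edited" });
put(created.id, { title: "hello, edited" });
const afterPuts = store.size; // the same PUT sent twice changed nothing further

console.log(afterPosts, afterPuts);
```

<p>This is why a client can safely retry a PUT after a network hiccup, while retrying a POST needs more care.</p>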
<h3 id="heading-server-response">Server Response</h3>
<p>What does the RESTful server response contain? The components are:</p>
<ol>
<li><p><strong>Status line:</strong> The status line contains a three-digit status code that communicates request success or failure. For instance, 2XX codes indicate success, but 4XX and 5XX codes indicate errors. 3XX codes indicate URL redirection.</p>
<p> The following are some common status codes:</p>
<ul>
<li><p>200: Generic success response</p>
</li>
<li><p>201: Created, the usual success response when a POST creates a resource</p>
</li>
<li><p>400: Incorrect request that the server cannot process</p>
</li>
<li><p>404: Resource not found</p>
</li>
</ul>
</li>
<li><p><strong>Message body:</strong> The response body contains the resource representation. The server selects an appropriate representation format based on what the request headers contain. Clients can request information in XML or JSON formats, which define how the data is written in plain text.</p>
</li>
<li><p><strong>Headers:</strong> The response also contains headers or metadata about the response. They give more context about the response and include information such as the server, encoding, date, and content type.</p>
</li>
</ol>
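<p>Since the first digit of a status code carries most of the meaning, a client can group codes by it. Here is a small sketch of that idea; <code>describeStatus</code> is a helper name I made up, and the labels follow the usual HTTP classes (4XX for client errors, 5XX for server errors).</p>

```javascript
// Group a status code by its first digit.
function describeStatus(code) {
  switch (Math.floor(code / 100)) {
    case 1: return "informational";
    case 2: return "success";
    case 3: return "redirection";
    case 4: return "client error";
    case 5: return "server error";
    default: return "unknown";
  }
}

for (const code of [200, 201, 301, 400, 404, 500]) {
  console.log(code, describeStatus(code));
}
```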
<h3 id="heading-common-response-status-codes">Common Response Status Codes</h3>
<p><img src="https://www.infidigit.com/wp-content/uploads/2019/12/20191227_012601_0000.png" alt="What Are Http Status Codes | Full List Of Http Status Codes - Infidigit" /></p>
<hr />
<p>We can get a clear view of REST APIs in the next blog REST and REST APIs - Part 2. This should have provided you with a clear view of what REST APIs and REST are. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[How to contribute using Git and GitHub]]></title><description><![CDATA[This blog will be a step-by-step guide to my Git and GitHub workshop, for which all the resources can be found here. All code snippets will be provided in the different readme files. You can send PRs (even dummy ones), and create issues and I'll merg...]]></description><link>https://highonbugs.sbk2k1.in/git-and-github</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/git-and-github</guid><category><![CDATA[Git]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[Gitcommands]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[version control]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Thu, 16 Feb 2023 15:32:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1676561117015/6b92903d-acef-4600-9d08-ebfb85293de6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This blog will be a step-by-step guide to my Git and GitHub workshop, for which all the resources can be found <a target="_blank" href="https://github.com/High-on-Bugs/Git-GitHub-Workshop">here</a>. All code snippets will be provided in the different readme files. You can send PRs (even dummy ones), and create issues and I'll merge them. (probably)</p>
<h3 id="heading-what-are-version-control-systems">What are Version Control Systems?</h3>
<p>Version control systems are a category of software tools that record changes made to files by keeping track of modifications to the code.</p>
<h3 id="heading-why-do-we-need-version-control-systems">Why do we need Version Control Systems?</h3>
<p>Software projects are undertaken by multiple developers with different areas of specialty. They may be in different locations, working at different times and on different functionalities/features. A version control system helps the developer team communicate efficiently and track all the changes made to the source code, along with information such as who made each change and what was changed.</p>
<h3 id="heading-some-of-the-benefits-of-version-control-systems-are">Some of the benefits of Version Control Systems are:</h3>
<ul>
<li><p>Enhances the project development speed by coordinating efforts between developers.</p>
</li>
<li><p>Enhances communication and productivity.</p>
</li>
<li><p>Provides a robust system to track changes and point out errors.</p>
</li>
<li><p>Effective for promoting remote work.</p>
</li>
<li><p>A fatal error can be easily fixed by rolling back to a previous version.</p>
</li>
<li><p>Helps in recovery in case of any disaster or contingent situation.</p>
</li>
<li><p>Informs us about Who, What, When, and Why changes have been made.</p>
</li>
</ul>
<h3 id="heading-what-is-git">What is Git?</h3>
<p>Git is a <a target="_blank" href="https://git-scm.com/about/free-and-open-source">free and open-source</a> distributed version control system designed to handle everything from small to very large projects with speed and efficiency.</p>
<p>It is an open-source project originally developed by Linus Torvalds in 2005 to manage development of the Linux kernel.</p>
<p>It is used for tracking changes in any set of files, usually used for coordinating work among developers collaboratively developing source code during software development. Its goals include speed, data integrity, and support for distributed, non-linear workflows.</p>
<h3 id="heading-what-is-github">What is GitHub?</h3>
<p>GitHub is a Microsoft-owned company that provides Internet hosting services for software development and version control using Git. It makes it a lot easier for developers to work and collaborate on projects together.</p>
<p><img src="https://miro.medium.com/max/1400/1*SSRjtoQ0H2X3SBPOiJ5rZw.jpeg" alt class="image--center mx-auto" /></p>
<p>GitHub provides a very user-friendly interface that introduces even beginners to version control. GitHub also works on promoting open-source development, community learning, and all other good stuff through different programs, but more on that later. (hmu if you want to know!)</p>
<h3 id="heading-git-vs-github">Git vs GitHub</h3>
<p>People get confused about the difference between Git and GitHub. What do they actually do? Which one to learn? (I was confused too)</p>
<p><img src="https://devmountain.com/wp-content/uploads/2022/01/Gitvs_Github-1a-1.jpg" alt class="image--center mx-auto" /></p>
<p>So Git is the application installed on the local computer that lets you manage and track source code, while GitHub provides a lot more features! It provides not only repository hosting services but also lets you work with most of the Git features through an easy-to-use web UI.</p>
<p>So finally (keywords)</p>
<ul>
<li><p>Git - On your local machine. Uses CLI. Track code and files. Helps create branches, unlike a lot of other VCSs. Made by Linus Torvalds. Open Source.</p>
</li>
<li><p>GitHub - Git repository hosting service. Cloud-based. Easy and cool UI. Can share with others. Visualize workflows. And a lot more!</p>
</li>
</ul>
<hr />
<h1 id="heading-lets-git-it">Let's Git it!</h1>
<h3 id="heading-requirements">Requirements</h3>
<p>We'll need two things to start version controlling right away:</p>
<ul>
<li><p><strong>Git installed on our Local System:</strong> Head over to the <a target="_blank" href="https://git-scm.com/downloads">Git Downloads Website</a>. Download Git for your system (Windows/Mac). Click through the installation process and you're good.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676554507025/6f403b20-bf12-4064-b93e-bf453f71a446.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>Create a GitHub account:</strong> Head over to the <a target="_blank" href="https://github.com/signup?ref_cta=Sign+up&amp;ref_loc=header+logged+out&amp;ref_page=%2F&amp;source=header-home">GitHub Website</a> and sign up to create your account!</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676554583965/dfcad198-1ace-47ba-9695-fb91286a22d4.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<h3 id="heading-configuration">Configuration</h3>
<p>We need to configure Git for ourselves using the following commands in a Terminal or Command Prompt.</p>
<pre><code class="lang-bash">git config --global user.email <span class="hljs-string">"&lt;your_email&gt;@example.com"</span>
git config --global user.name <span class="hljs-string">"&lt;your_name&gt;"</span>
</code></pre>
<hr />
<h1 id="heading-the-workflows">The Workflows</h1>
<p>In this blog, I'll go through the entirety of two workflows.</p>
<ul>
<li><p>Create and work on your own repository.</p>
</li>
<li><p>Contribute to an open-source repository.</p>
</li>
</ul>
<p>We need to understand some terms before that.</p>
<h3 id="heading-terminology">Terminology</h3>
<ul>
<li><p><strong>Open source</strong> - A software development philosophy that emphasizes transparency, collaboration, and community-driven innovation. Open-source projects make their source code publicly available for others to use, modify, and distribute freely.</p>
</li>
<li><p><strong>Repository</strong> - A central location in Git where all the project's files and version history are stored. Developers can make changes to files and commit those changes to the repository.</p>
</li>
<li><p><strong>Branch</strong> - A copy of the repository that allows developers to work on new features or bug fixes without affecting the main codebase. Once the changes are complete, they can be merged back into the main branch.</p>
</li>
<li><p><strong>Pull request</strong> - A request to merge changes made in a branch into the main codebase. Other developers can review the changes and provide feedback before the changes are merged.</p>
</li>
<li><p><strong>Fork</strong> - A copy of a repository that allows developers to make changes without affecting the original codebase. Forks are often used in open-source projects to contribute changes back to the original project.</p>
</li>
<li><p><strong>Issue</strong> - A problem or task that needs to be addressed in a project. Issues can be used to track bugs, feature requests, or other tasks.</p>
</li>
<li><p><strong>Commit</strong> - A snapshot of changes made to the codebase. Commits include a message describing the changes made and who made them.</p>
</li>
<li><p><strong>Merge</strong> - The process of combining changes from one branch or repository into another. Merges can be used to integrate changes made in a fork back into the original project.</p>
</li>
<li><p><strong>Clone</strong> - A copy of a repository that is stored on a local machine. Cloning a repository allows developers to work on the code without being connected to the internet.</p>
</li>
<li><p><strong>Pull</strong> - The process of downloading changes made to a remote repository to a local machine. Pulling updates from a remote repository ensures that a local repository is up-to-date with the latest changes made by other developers.</p>
</li>
<li><p><strong>Push</strong> - The process of uploading changes made from a local repository to a remote repository. Pushing changes allows other developers to see the changes and collaborate on the project.</p>
</li>
<li><p><strong>.gitignore</strong> - A file in a repository that specifies which files or directories should be excluded from version control. Files or directories listed in the .gitignore file will not be tracked by Git.</p>
</li>
<li><p><strong>License</strong> - A legal agreement that defines how an open-source project can be used and distributed. Open source licenses typically allow others to use and modify the code but may require attribution or impose other conditions.</p>
</li>
</ul>
<hr />
<h2 id="heading-workflow-1-personal-project">Workflow #1: Personal Project</h2>
<p>To convert a directory to a git repository we need to use the <code>git init</code> command</p>
<pre><code class="lang-bash">git init
</code></pre>
<p>Let's open up a terminal in a certain directory and enter the command</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676556566941/a99ceb54-afad-4f89-9ee1-e36f083afd9b.png" alt class="image--center mx-auto" /></p>
<p>Great! The directory is now a git repository and git is now tracking any changes done inside.</p>
<p>Let us then create two markdown files to store the commands we used until now.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676556850046/c2057918-2dde-4ed2-9078-3eb3f7debdb0.png" alt class="image--center mx-auto" /></p>
<p>Inside the terminal let us type the <code>git status</code> command to see which files are added to the staging area.</p>
<pre><code class="lang-bash">git status
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676557044371/cbd4be32-3104-4c08-91c2-1bff9fb096d7.png" alt class="image--center mx-auto" /></p>
<p>We can add these files to the staging area using the <code>git add &lt;filename&gt;</code> command. We can use <code>.</code> instead of <code>&lt;filename&gt;</code> to signify that we want to add all the files in the repository to the staging area.</p>
<pre><code class="lang-bash">git add .
</code></pre>
<p>Let us type <code>git status</code> once again to check if the files are added to the staging area.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676557267428/2bdd0bc9-9eb2-4e70-bea8-e56ef2186a26.png" alt class="image--center mx-auto" /></p>
<p>Great! The files are added to the staging area and are ready to be committed. We will commit the staged files using the <code>git commit</code> command</p>
<pre><code class="lang-bash">git commit -m <span class="hljs-string">"Commit Message"</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676557347199/c02badef-3055-469d-b400-6685af282a54.png" alt class="image--center mx-auto" /></p>
<p>Now the files are committed and git has stored a snapshot of the files. You can roll back to this commit if any errors pop up on further development.</p>
<p>We can now create a new repository on GitHub, by choosing all the required options.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676557674863/808f1455-b234-4af9-bec1-d8d2bdce450c.png" alt class="image--center mx-auto" /></p>
<p>After the empty GitHub repo is created, follow the second set of instructions, “Push an existing repository…”</p>
<pre><code class="lang-bash">git remote add origin https://github.com/&lt;your_username&gt;/learn-git
git push -u origin master
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1676558673223/676f804e-3d0a-4362-9540-a4c8b907a833.png" alt class="image--center mx-auto" /></p>
<p>Now your code is hosted publicly and visible to everyone!</p>
<h3 id="heading-stages-in-git">Stages in Git</h3>
<p><img src="https://miro.medium.com/max/875/1*ogHlBWY-Bn24FfsQcpeSiw.png" alt /></p>
<p>Let us now visualize what happens when you use the add and commit commands.</p>
<p><img src="https://git-scm.com/book/en/v2/images/lifecycle.png" alt="Git - Recording Changes to the Repository" /></p>
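<p>If you would like to watch these stages change on your own machine, here is a minimal sketch. It assumes a throwaway directory created with <code>mktemp</code>, a made-up file called <code>notes.txt</code>, and a dummy identity passed with <code>-c</code> so the commit works even if Git is not configured yet. None of these names come from the walkthrough above.</p>
<pre><code class="lang-bash"># Sandbox sketch: watch one file move through Git's stages.
# The directory, file name and identity are made up for this example.
set -e
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q

touch notes.txt
untracked="$(git status --short)"    # "?? notes.txt" means untracked

git add notes.txt
staged="$(git status --short)"       # "A  notes.txt" means staged as a new file

# -c supplies a throwaway identity for this one command only
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add notes"
committed="$(git status --short)"    # empty: nothing left to stage or commit

echo "untracked: $untracked"
echo "staged:    $staged"
echo "committed: [$committed]"
</code></pre>
<p>The <code>--short</code> flag prints a compact two-letter status per file, which makes the move from untracked to staged to committed easy to spot.</p>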
<p>That was the entire workflow for creating and hosting your personal project. You now have an understanding of the entire process and can create your own repositories on GitHub. You can find definitions of the commands used in this section below.</p>
<h3 id="heading-commands">Commands</h3>
<ol>
<li><p><code>git init</code>: Initializes a new Git repository in the current working directory. This creates a new <code>.git</code> directory that contains all the necessary files and subdirectories for Git to track changes in your project.</p>
</li>
<li><p><code>git add</code>: Adds changes to the staging area, which is a temporary storage area for changes before they are committed. You can use this command to add specific files or directories, or you can use <code>git add .</code> to add all changes in the current directory.</p>
</li>
<li><p><code>git status</code>: Displays the current status of the working directory, including any changes that have been made but not yet staged, any changes that have been staged but not committed, and any untracked files. This command is useful for keeping track of what changes have been made and what still needs to be done.</p>
</li>
<li><p><code>git commit</code>: Commits the changes in the staging area to the local Git repository. This creates a new commit with a unique identifier, a commit message, and a snapshot of the changes that were added to the staging area.</p>
</li>
<li><p><code>git remote add origin &lt;url&gt;</code>: Registers the remote repository that Git will use for pushing and pulling changes. This command stores the URL of the remote repository under the name "origin", which is the conventional name for the primary remote repository. (To change the URL of an existing remote, use <code>git remote set-url origin &lt;url&gt;</code>.)</p>
</li>
<li><p><code>git push</code>: Pushes the committed changes to the remote repository. This sends the changes to the remote repository and updates the branch that you are working on. You can specify the remote repository and branch using the command <code>git push &lt;remote&gt; &lt;branch&gt;</code>.</p>
</li>
</ol>
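<p>The remote-related commands are safe to try without touching GitHub at all, because registering a remote only writes to your local configuration. The sketch below assumes a throwaway repository and keeps <code>your_username</code> as a placeholder, so nothing is actually pushed anywhere.</p>
<pre><code class="lang-bash"># Sandbox sketch: register a remote and inspect it. No network access happens here.
set -e
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q

# your_username is a placeholder, not a real account
git remote add origin https://github.com/your_username/learn-git
git remote -v                          # lists origin once for fetch and once for push
url="$(git remote get-url origin)"

# A wrong URL can be corrected later without removing the remote
git remote set-url origin https://github.com/your_username/learn-git.git
fixed="$(git remote get-url origin)"

echo "before: $url"
echo "after:  $fixed"
</code></pre>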
<hr />
<h2 id="heading-workflow-2-contributing">Workflow #2: Contributing</h2>
<p>This is the workflow you will use most often when contributing to an open-source repository. Contribution typically involves the following steps:</p>
<ol>
<li><p><strong>Fork the repository:</strong> Forking creates a copy of the original repository under your own account, which you can work on independently. To do this, navigate to the repository's GitHub page and click the "Fork" button in the upper right corner.</p>
</li>
<li><p><strong>Clone the fork:</strong> Once you've forked the repository, you'll want to clone it to your local machine so you can make changes to it. To do this, run the following command in your terminal, replacing <code>your-username</code> with your GitHub username and <code>repository-name</code> with the name of the repository you forked:</p>
<pre><code class="lang-bash"> git <span class="hljs-built_in">clone</span> git@github.com:your-username/repository-name.git
</code></pre>
</li>
<li><p><strong>Create a new branch:</strong> It's generally a good idea to create a new branch for each set of changes you make. To do this, run the following command, replacing <code>new-branch-name</code> with a descriptive name for your new branch:</p>
<pre><code class="lang-bash"> git checkout -b new-branch-name
</code></pre>
<p> <img src="https://miro.medium.com/max/875/1*XJmL7fWV4Coxt12AzHeG-A.png" alt /></p>
<p> ⬆️ Visualization of branches in Git</p>
</li>
<li><p><strong>Make changes:</strong> Now that you have the repository cloned and a new branch checked out, you can make changes to the code. Use your preferred text editor or IDE to edit the files.</p>
</li>
<li><p><strong>Commit your changes:</strong> Once you've made the changes, stage them and commit them to your local repository. Both commands are covered in the personal workflow section of this blog.</p>
<pre><code class="lang-bash"> git add .
 git commit -m <span class="hljs-string">"commit-message"</span>
</code></pre>
</li>
<li><p><strong>Push the changes:</strong> After committing your changes, you'll want to push them to your forked repository on GitHub. To do this, run the following command, replacing <code>new-branch-name</code> with the name of the branch you created:</p>
<pre><code class="lang-bash"> git push -u origin new-branch-name
</code></pre>
</li>
<li><p><strong>Create a pull request:</strong> Once you've pushed your changes to your forked repository, you can create a pull request to merge your changes into the original repository. To do this, navigate to the original repository's GitHub page and click the "New pull request" button. Select your fork and the branch you just pushed, and provide a description of your changes.</p>
</li>
<li><p><strong>Respond to feedback:</strong> The maintainers of the original repository may request changes or ask questions about your pull request. Be sure to respond in a timely manner and make any requested changes.</p>
</li>
<li><p><strong>Merge your changes:</strong> If the maintainers approve your pull request, they will merge your changes into the original repository. Congratulations, you've successfully contributed to an open-source project!</p>
</li>
</ol>
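<p>Steps 2 to 6 above can be rehearsed offline before you try them on a real project. This is only a sketch under a few assumptions: a local bare repository named <code>fork.git</code> stands in for your fork on GitHub, the helper <code>me</code> and the file <code>CONTRIBUTORS.md</code> are made up for the example, and forking and the pull request itself still happen on the GitHub website.</p>
<pre><code class="lang-bash"># Sandbox sketch of steps 2-6, with a local bare repository playing the fork.
set -e
work="$(mktemp -d)"
cd "$work"
# Hypothetical helper: runs git with a throwaway identity
me() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# Seed a repository with one commit, then turn it into the stand-in fork
git init -q seed
cd seed
touch README.md
git add README.md
me commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed fork.git

git clone -q fork.git repository-name      # 2. clone the fork
cd repository-name
git checkout -q -b new-branch-name         # 3. create a new branch
touch CONTRIBUTORS.md                      # 4. make a change
git add CONTRIBUTORS.md                    # 5. stage and commit it
me commit -q -m "Add contributors file"
git push -q -u origin new-branch-name      # 6. push the branch to the fork

on_fork="$(git ls-remote --heads origin new-branch-name)"
echo "$on_fork"                            # commit id followed by refs/heads/new-branch-name
</code></pre>
<p>The <code>-u</code> flag in step 6 links the local branch to the one on the fork, so a plain <code>git push</code> is enough for later commits on the same branch.</p>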
<p>Note that some repositories may have slightly different workflows or conventions, so be sure to check the project's documentation or ask the maintainers if you're unsure.</p>
<p>After the steps above, your code is still not in the master branch, right? Teams usually require a minimum number of approving reviews before a pull request can be merged. A reviewer might ask for code changes, better documentation, or anything else. Once enough eyes have been on your work, the maintainers can merge it! You can also send a PR to the master branch directly. Please refer to the documentation of the repository and check its contribution guidelines.</p>
<h3 id="heading-commands-1">Commands</h3>
<ol>
<li><p><code>git checkout</code>: This command allows you to switch between different branches or versions of your code. When you run <code>git checkout</code> followed by a branch name, Git will replace the contents of your working directory with the version of the code stored in that branch. This is useful when you want to work on a different version of the code, or, with the <code>-b</code> flag, when you want to create a new branch to work on.</p>
</li>
<li><p><code>git branch</code>: This command allows you to create, list, and delete branches in your Git repository. When you create a new branch, you create a separate version of the code that can be modified independently of the main branch. This allows you to experiment with changes without affecting the main codebase. You can also use <code>git branch</code> to see a list of all branches in your repository and to switch between them using <code>git checkout</code>.</p>
</li>
<li><p><code>git merge</code>: This command allows you to combine changes from one branch into another. When you run <code>git merge</code> followed by the name of the branch you want to merge, Git will apply the changes made in that branch to the current branch. This is useful when you want to incorporate changes from a feature branch into the main codebase, or when you want to bring a forked repository up to date with the original repository.</p>
</li>
</ol>
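<p>Here is a small sketch of how these three commands fit together. It assumes a throwaway repository and made-up file names (<code>app.txt</code>, <code>feature.txt</code>), and it reads the name of the starting branch instead of assuming <code>master</code>, since that default depends on your Git setup.</p>
<pre><code class="lang-bash"># Sandbox sketch: branch off, commit, switch back, merge.
set -e
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
# Hypothetical helper: commits with a throwaway identity
commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q -m "$1"; }

touch app.txt
git add app.txt
commit "Initial commit"
base="$(git rev-parse --abbrev-ref HEAD)"   # master or main, depending on your setup

git checkout -q -b feature                  # create a branch and switch to it
touch feature.txt
git add feature.txt
commit "Add feature file"

git checkout -q "$base"                     # switch back: feature.txt leaves the working directory
before="$(ls)"
git merge -q feature                        # bring the feature branch's commit into the base branch
after="$(ls)"
git branch --list                           # shows both branches, with * on the current one

echo "before merge: $before"
echo "after merge:  $after"
</code></pre>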
<hr />
<h3 id="heading-fun-fact"><strong>Fun Fact:</strong></h3>
<p><code>.gitignore</code> is a file that specifies files or directories that Git should ignore when tracking changes in a repository. This is useful for files that are generated during the development process, such as log files or temporary files, or for sensitive information that should not be committed to the repository, such as API keys or passwords. By listing these files or directories in a <code>.gitignore</code> file, you can ensure that they are not accidentally committed to the repository.</p>
<p><code>.gitkeep</code>, on the other hand, is a file that is used by convention to ensure that an otherwise empty directory is included in a Git repository. Git does not track empty directories, so if you want to include one in your repository, you can add a <code>.gitkeep</code> file to it. The name is not special to Git; any file would do. The file can be empty, but its presence gives Git something to track inside the directory, so the directory is included in the repository.</p>
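<p>Both files are easy to try out in a throwaway repository. In the sketch below the ignore patterns (<code>*.log</code>, <code>.env</code>) and the file and directory names are only examples chosen for illustration, not a recommended list.</p>
<pre><code class="lang-bash"># Sandbox sketch: ignore some files and keep an otherwise empty directory.
set -e
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q

# tee writes the two example patterns to .gitignore and also prints them
printf '%s\n' '*.log' '.env' | tee .gitignore

touch debug.log .env main.py
mkdir uploads
touch uploads/.gitkeep             # placeholder so the empty directory can be committed

git add .
tracked="$(git ls-files)"          # files now in the staging area
echo "$tracked"
git check-ignore debug.log .env    # prints the paths that Git is ignoring
</code></pre>
<p>After <code>git add .</code>, the staging area holds <code>.gitignore</code>, <code>main.py</code> and <code>uploads/.gitkeep</code>, while <code>debug.log</code> and <code>.env</code> are left out.</p>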
<hr />
<p>Now you know the two main workflows used by some of the biggest organizations in the world. With these workflows, multiple developers can collaborate and integrate their work in a shared codebase without stepping on each other's changes.</p>
<p>The knowledge of these tools will definitely help you build and dish out better software and open you up to a thriving community of developers working towards common goals. Let's Git it. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/"><strong>LinkedIn</strong></a> / <a target="_blank" href="https://github.com/sbk2k1"><strong>GitHub</strong></a> / <a target="_blank" href="https://twitter.com/sbk_2k1"><strong>Twitter</strong></a>.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title><![CDATA[My GitHub Campus Expert 🚩 Application Process [SELECTED]]]></title><description><![CDATA[Hello everyone, hope you're doing fine. I've been DM'ed a lot over LinkedIn and a lot of other Social Media Platforms regarding queries about my GitHub Campus Expert Journey and also about the selection process. This blog is an attempt to answer all ...]]></description><link>https://highonbugs.sbk2k1.in/my-github-campus-expert-application-process-selected</link><guid isPermaLink="true">https://highonbugs.sbk2k1.in/my-github-campus-expert-application-process-selected</guid><category><![CDATA[GitHub]]></category><category><![CDATA[github campus experts]]></category><category><![CDATA[training]]></category><category><![CDATA[gce]]></category><dc:creator><![CDATA[Saptarshi Bhattacharya]]></dc:creator><pubDate>Tue, 14 Feb 2023 07:36:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1676360759225/903e24cc-b92c-4371-b00f-1c81e5032b82.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello everyone, hope you're doing fine. I've been DM'ed a lot over LinkedIn and a lot of other Social Media Platforms regarding queries about my GitHub Campus Expert Journey and also about the selection process. This blog is an attempt to answer all those questions and help understand what are the necessary steps to follow to become a GitHub Campus Expert 🚩.</p>
<h3 id="heading-how-did-i-come-across-the-github-campus-expert-program">How did I come across the GitHub Campus Expert 🚩Program?</h3>
<p>I had already got my hands on the <a target="_blank" href="https://education.github.com/discount_requests/pack_application">GitHub Student Developer Pack</a> which provides you with access to amazing developer tools worth a...lot of money. Probably 5 figures. USD. Probably even more. You can learn and improve your skills using the Developer Pack. (Check it out if you haven't).</p>
<p>Anyways, I was browsing through the <a target="_blank" href="https://education.github.com/">GitHub Education</a> website and discovered the GitHub Campus Expert 🚩 Program. After some reading and a quick YouTubing session, I clicked on the <em>Become a Campus Expert</em> button. (I was a bit early and asked them to notify me when the selection process started, but we'll just skip through that. :D)</p>
<h3 id="heading-so-what-is-the-github-campus-expert-program">So what is the GitHub Campus Expert 🚩Program?</h3>
<p>According to GitHub- "Campus Experts are student leaders that strive to build diverse and inclusive spaces to learn skills, share their experiences, and build projects together. They can be found across the globe leading in-person and online conferences, meetups, and hackathons, and maintaining open source projects."</p>
<p>So as a Campus Expert 🚩, GitHub provides you with resources that help to grow your local community from scratch. It helps you in organizing events and doing everything else to engage and nurture your community. (You also get a lot of personal opportunities and networking opportunities!) Although GitHub Campus Experts 🚩 are not your average Campus Ambassadors, they are not GitHub Employees either. We represent and spread the boon of Git and GitHub and try to help grow local communities.</p>
<h3 id="heading-eligibility">Eligibility</h3>
<p>To apply for the program, you must:</p>
<ul>
<li><p>Be 18 years of age or older</p>
</li>
<li><p>Have a <a target="_blank" href="https://github.com/"><strong>GitHub Account</strong></a> that's at least 6 months old</p>
</li>
<li><p>Have the <a target="_blank" href="https://education.github.com/discount_requests/student_application?utm_source=2022-04-08-hack4bengal"><strong>GitHub Student Developer Pack</strong></a></p>
</li>
<li><p>Be enrolled in a formal higher education institution</p>
</li>
<li><p>Have at least one year before graduating</p>
</li>
</ul>
<h3 id="heading-step-1-get-the-pack">Step 1: Get the Pack</h3>
<p>That's right. You need to get your hands on the <a target="_blank" href="https://education.github.com/discount_requests/pack_application">GitHub Student Developer Pack</a> to be eligible to apply to the program. You need to verify that you are a student by uploading an institute ID or by using your institute-issued student email address. Once you've been verified, you can proceed to the next step, which is filling out the initial application. This part is pretty self-explanatory and should be easy. (hmu if you still get stuck)</p>
<h3 id="heading-step-2-the-form">Step 2: The Form</h3>
<p>Complete the application form with all the essays. Remember, plagiarism is a sin, and the system will auto-reject your application if it senses any form of plagiarism. The essays and the entire process check for these things:</p>
<ul>
<li><p><strong>Potential:</strong> What do you want to do? What do you want to learn?</p>
</li>
<li><p><strong>Motivation:</strong> Why do you want to do what you want to do?</p>
</li>
<li><p><strong>Interest:</strong> Why do you want to be part of the program? What have you done to ensure success thus far?</p>
</li>
<li><p><strong>Contribution:</strong> What will you be able to do once you're in? What have you already done?</p>
</li>
</ul>
<p>Try to include these details in your essays:</p>
<ul>
<li><p>Trace out your community. What kind of people does your community comprise?</p>
</li>
<li><p>Talk about the problems you have faced while building or working with your community. How have you tackled them (if you have), and how do they keep people from learning or achieving their goals?</p>
</li>
<li><p>Explain what the status quo is in your community.</p>
</li>
<li><p>Where will the GitHub Campus Expert 🚩Program come into the picture?</p>
</li>
<li><p>Talk about your values. Inclusivity, Diversity, making people feel safe and fostering a learning environment.</p>
</li>
<li><p>Talk about your goals.</p>
</li>
</ul>
<p>Check out <a target="_blank" href="https://github.blog/2020-12-10-introducing-the-new-and-improved-campus-experts-program/">this blog</a> for additional reference. (It's from GitHub)</p>
<p>Also maybe <a target="_blank" href="https://dev.to/induja/becoming-a-github-campus-expert-3pe0">this one</a></p>
<h3 id="heading-step-3-the-video">Step 3: The Video</h3>
<p>You will be notified via email at the end of the review if GitHub would like to move forward with your application. You'll then be asked to submit a video resume - a simple video of you talking about yourself, your community, and your visions for the community.</p>
<p>This helps the GitHub team understand you better and get to know you more closely. So be confident and free in the way you speak. You'll have to submit the video resume within 2 weeks. It takes about a week to review the videos. You'll be notified via email if you make it through.</p>
<p>Reference video I used: ( Thank you <a class="user-mention" href="https://hashnode.com/@dwvicy">Vaishnavi Dwivedi</a> ! )</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=wu-lQfoS6A0">https://www.youtube.com/watch?v=wu-lQfoS6A0</a></div>
<h3 id="heading-step-4-training">Step 4: Training</h3>
<p>If your submission is approved, congratulations! You’ve been accepted to the program. You’ll go through the GitHub Campus Experts 🚩 Training. The training has six modules and takes 12 hours to complete over a span of 6 weeks.</p>
<p>Here you’ll be able to analyze your community and learn community leadership skills like Inclusivity, Information Design, Public Speaking, Communities, and Software Dev skills. At the end of your training, you’ll submit a community proposal that will serve as a guideline for your community and you’ll become a GitHub Campus Expert 🚩.</p>
<h3 id="heading-faq">F.A.Q.</h3>
<p>Q: Can you please share your training?</p>
<p>A: No. The training modules are designed to help you learn and look out for the things needed to become an effective Community Lead/Expert. You should come up with the answers yourself.</p>
<p>Q: Can you review my essays?</p>
<p>A: I'm sorry but I won't be able to help every one of you with your essays. But I can give you a few pointers. Try to be yourself. Reflect upon what you know about your community, and what you can and will do. Your intentions should speak. Do not try to plagiarize from others.</p>
<p>Q: My application got rejected. Can I reapply?</p>
<p>A: Yes, absolutely. Since the program looks for new Campus Experts 🚩 every 6 months, you can apply again in the next semester.</p>
<p>I'll keep on adding the FAQs as I get more questions!</p>
<h3 id="heading-thank-you">Thank You!</h3>
<p>Thank you for being with me till here! I tried to cover everything that would be necessary to know about the process. Good Luck with your application. If you still have any questions/queries you can reach out to me on my <a target="_blank" href="https://www.linkedin.com/in/sbk2k1/">LinkedIn</a> / <a target="_blank" href="https://github.com/sbk2k1">GitHub</a> / <a target="_blank" href="https://twitter.com/sbk_2k1">Twitter</a>.</p>
<p>Cheers!</p>
]]></content:encoded></item></channel></rss>