How to contribute using Git and GitHub

How to contribute using Git and GitHub

This blog will be a step-by-step guide to my Git and GitHub workshop, for which all the resources can be found here. All code snippets will be provided in the different readme files. You can send PRs (even dummy ones), and create issues and I'll merge them. (probably)

What are Version Control Systems?

Version control systems are a category of software tools that helps in recording changes made to files by keeping a track of modifications done in the code.

Why do we need Version Control Systems?

Software projects are undertaken by multiple developers with different areas of specialty. They may be present in different locations, working at different times and on different functionalities/features. A version control system is a kind of software that helps the developer team to efficiently communicate and manage(track) all the changes that have been made to the source code along with information like who made and what changes have been made.

Some of the benefits of Version Control Systems are:

  • Enhances the project development speed by coordinating efforts between developers.

  • Enhances communication and productivity.

  • Provides a robust system to track changes and point out errors.

  • Effective for promoting remote work.

  • A fatal error can be easily fixed by rolling back to a previous version

  • Helps in recovery in case of any disaster or contingent situation,

  • Informs us about Who, What, When, and Why changes have been made.

What is Git?

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

It is an open-source project developed originally by Linus Torvalds in 2005 while creating the Linux Operating System Kernel.

It is used for tracking changes in any set of files, usually used for coordinating work among developers collaboratively developing source code during software development. Its goals include speed, data integrity, and support for distributed, non-linear workflows.

What is GitHub?

GitHub is a Microsoft-owned company that provides Internet hosting services for software development and version control using Git. It makes it a lot easier for developers to work and collaborate on projects together.

GitHub provides a very user-friendly interface that initiates even beginners to version control. GitHub also works on promoting open-source development, community learning, and all other good stuff through different programs, but more on that later. (hmu if you want to know!)

Git vs GitHub

People get confused about the difference between Git and GitHub. What do they actually do? Which one to learn? (I was confused too)

So Git is the application installed on the local computer that lets you manage and track source code, while GitHub provides a lot more features! It provides only repository hosting services and lets you work on most of the Git features through an easy-to-use web UI.

So finally (keywords)

  • Git - On your local machine. Uses CLI. Track code and files. Helps create branches, unlike a lot of other VCSs. Made by Linus Torvalds. Open Source.

  • GitHub - Git repository hosting service. Cloud-based. Easy and cool UI. Can share with others. Visualize workflows. And a lot more!


Let's Git it!

Requirements

We'll need two things to start version controlling right away:

  • Git installed on our Local System: Head over to the Git Downloads Website. Download Git for your system (Windows/Mac). Click through the installation process and you're good.

  • Create a GitHub account: Head over to the GitHub Website and sign up to create your account!

Configuration

We need to configure Git for ourselves using the following commands in a Terminal or Command Prompt.

git config --global user.email "<your_email>@example.com"
git config --global user.name "<your_password>

The Workflows

In this blog, I'll go through the entirety of two workflows.

  • Create and work on your own repository.

  • Contribute to an open-source repository.

We need to understand some terms before that.

Terminology

  • Open source - A software development philosophy that emphasizes transparency, collaboration, and community-driven innovation. Open-source projects make their source code publicly available for others to use, modify, and distribute freely.

  • Repository - A central location in Git where all the project's files and version history are stored. Developers can make changes to files and commit those changes to the repository.

  • Branch - A copy of the repository that allows developers to work on new features or bug fixes without affecting the main codebase. Once the changes are complete, they can be merged back into the main branch.

  • Pull request - A request to merge changes made in a branch into the main codebase. Other developers can review the changes and provide feedback before the changes are merged.

  • Fork - A copy of a repository that allows developers to make changes without affecting the original codebase. Forks are often used in open-source projects to contribute changes back to the original project.

  • Issue - A problem or task that needs to be addressed in a project. Issues can be used to track bugs, feature requests, or other tasks.

  • Commit - A snapshot of changes made to the codebase. Commits include a message describing the changes made and who made them.

  • Merge - The process of combining changes from one branch or repository into another. Merges can be used to integrate changes made in a fork back into the original project.

  • Clone - A copy of a repository that is stored on a local machine. Cloning a repository allows developers to work on the code without being connected to the internet.

  • Pull - The process of downloading changes made to a remote repository to a local machine. Pulling updates from a remote repository ensures that a local repository is up-to-date with the latest changes made by other developers.

  • Push - The process of uploading changes made from a local repository to a remote repository. Pushing changes allows other developers to see the changes and collaborate on the project.

  • gitignore- A file in a repository that specifies which files or directories should be excluded from version control. Files or directories listed in the .gitignore file will not be tracked by Git.

  • License - A legal agreement that defines how an open-source project can be used and distributed. Open source licenses typically allow others to use and modify the code but may require attribution or impose other conditions.


Workflow #1: Personal Project

To convert a directory to a git repository we need to use the git init command

git init

Let's open up a terminal in a certain directory and enter the command

Great! The directory is now a git repository and git is now tracking any changes done inside.

Let us then create two markdown files to store the commands we used until now.

Inside the terminal let us type the git status command to see which files are added to the staging area.

git status

We can add these files to the staging area using the git add <filename> command. We can use . instead of <filename> to signify that we want to add all the files in the repository to the staging area.

git add .

Let us type git status once again to check if the files are added to the staging area.

Great! The files are added to the staging area and are ready to be committed. We will commit the staged files using the git commit command

git commit -m "Commit Message"

Now the files are committed and git has stored a snapshot of the files. You can roll back to this commit if any errors pop up on further development.

We can now create a new repository on GitHub, by choosing all the required options.

After the empty GitHub repo is created, follow the second set of instructions, “Push an existing repository…”

git remote add origin https://github.com/<your_username>/learn-git
git push -u origin master

Now your code is hosted publicly and visible to everyone!

Stages in Git

Let us know visualize what happens when you use the add and commit messages.

Git - Recording Changes to the Repository

This was the entire create and host your personal project workflow. You now have an understanding of the entire process and can create your own repositories on GitHub. You can find a definition of commands used in this section down below.

Commands

  1. git init: Initializes a new Git repository in the current working directory. This creates a new .git directory that contains all the necessary files and subdirectories for Git to track changes in your project.

  2. git add: Adds changes to the staging area, which is a temporary storage area for changes before they are committed. You can use this command to add specific files or directories, or you can use git add . it to add all changes in the current directory.

  3. git status: Displays the current status of the working directory, including any changes that have been made but not yet staged, any changes that have been staged but not committed, and any untracked files. This command is useful for keeping track of what changes have been made and what still needs to be done.

  4. git commit: Commits the changes in the staging area to the local Git repository. This creates a new commit with a unique identifier, a commit message, and a snapshot of the changes that were added to the staging area.

  5. git remote set origin: Sets the remote repository that Git will use for pushing and pulling changes. This command sets the URL of the remote repository and gives it the name "origin", which is the default name for the primary remote repository.

  6. git push: Pushes the committed changes to the remote repository. This sends the changes to the remote repository and updates the branch that you are working on. You can specify the remote repository and branch using the command git push <remote> <branch>.


Workflow #2: Contributing

This is the one you are mostly going to use when you are trying to contribute to an open-source repository. Contribution typically involves the following steps:

  1. Fork the repository: Forking creates a copy of the original repository under your own account, which you can work on independently. To do this, navigate to the repository's GitHub page and click the "Fork" button in the upper right corner.

  2. Clone the fork: Once you've forked the repository, you'll want to clone it to your local machine so you can make changes to it. To do this, run the following command in your terminal, replacing your-username with your GitHub username and repository-name with the name of the repository you forked:

     git clone git@github.com:your-username/repository-name.git
    
  3. Create a new branch: It's generally a good idea to create a new branch for each set of changes you make. To do this, run the following command, replacing new-branch-name with a descriptive name for your new branch:

     git checkout -b new-branch-name
    

    ⬆️ Visualization of branches in Git

  4. Make changes: Now that you have the repository cloned and a new branch checked out, you can make changes to the code. Use your preferred text editor or IDE to edit the files.

  5. Commit your changes: Once you've made the changes, you'll want to commit them to your local repository. To commit, refer to the personal workflow section of the blog.

     git commit -m "commit-message"
    
  6. Push the changes: After committing your changes, you'll want to push them to your forked repository on GitHub. To do this, run the following command, replacing new-branch-name with the name of the branch, you created:

     git push -u origin new-branch-name
    
  7. Create a pull request: Once you've pushed your changes to your forked repository, you can create a pull request to merge your changes into the original repository. To do this, navigate to the original repository's GitHub page and click the "New pull request" button. Select your fork and the branch you just pushed, and provide a description of your changes.

  8. Respond to feedback: The maintainers of the original repository may request changes or ask questions about your pull request. Be sure to respond in a timely manner and make any requested changes.

  9. Merge your changes: If the maintainers approve your pull request, they will merge your changes into the original repository. Congratulations, you've successfully contributed to an open-source project!

Note that some repositories may have slightly different workflows or conventions, so be sure to check the project's documentation or ask the maintainers if you're unsure.

From the above process, your code is still not in the master branch yet right? Teams usually establish a minimum amount of reviews to get a pull request merged. A reviewer might ask for code changes and, better documentation or anything else. Once you get enough numbers of eyes on your work, they can merge it! You can also send a PR to the master branch directly. Please refer to the documentation of the repository and check for guidelines for contribution.

Commands

  1. git checkout: This command allows you to switch between different branches or versions of your code. When you run git checkout followed by a branch name, Git will replace the contents of your working directory with the version of the code stored in that branch. This is useful when you want to work on a different version of the code, or when you want to create a new branch to work on.

  2. git branch: This command allows you to create, list, and delete branches in your Git repository. When you create a new branch, you create a separate version of the code that can be modified independently of the main branch. This allows you to experiment with changes without affecting the main codebase. You can also use git branch to see a list of all branches in your repository and to switch between them using git checkout.

  3. git merge: This command allows you to combine changes from one branch into another. When you run git merge followed by the name of the branch you want to merge, Git will apply the changes made in that branch to the current branch. This is useful when you want to incorporate changes from a feature branch into the main codebase, or when you want to bring a forked repository up to date with the original repository.


Fun Fact :

gitignore is a file that specifies files or directories that Git should ignore when tracking changes in a repository. This is useful for files that are generated during the development process, such as log files or temporary files, or for sensitive information that should not be committed to the repository, such as API keys or passwords. By listing these files or directories in a .gitignore file, you can ensure that they are not accidentally committed to the repository.

gitkeep, on the other hand, is a file that is used to ensure that an otherwise empty directory is included in a Git repository. Git does not track empty directories, so if you want to include an empty directory in your repository, you can add a .gitkeep file to that directory. This file can be empty, but its presence will ensure that Git tracks the directory and includes it in the repository.


Now you know the two main workflows used by some of the biggest organizations in the world. With these workflows, we can ensure proper integration and collaboration of work where negative collisions are avoided in a workspace of multiple developers.

The knowledge of these tools will definitely help you build and dish out better software and open you up to a thriving community of developers working towards common goals. Let's Git it. If you still have any questions/queries you can reach out to me on my LinkedIn / GitHub / Twitter.

Cheers!

Did you find this article valuable?

Support Saptarshi Bhattacharya by becoming a sponsor. Any amount is appreciated!