In this tutorial, we will focus on the basics of Git and the version control platform GitHub.
Many data scientists come from heterogeneous backgrounds such as physics and mathematics, and quite often from research and academia. To build and maintain a data science project or product smoothly, they need software engineering best practices. Git is one of the skills every software engineer needs to manage a code base efficiently. In this tutorial, we focus on Git and the GitHub platform as part of those practices.
Git is a version control system that tracks every modification to your code; it is the technology that performs the tracking and merging of changes in source code. GitHub is a widely used web-based hosting platform built on Git, and you can also browse many open-source projects there. Similar hosting platforms include GitLab and Bitbucket. Tools such as Sourcetree are desktop clients for Git rather than hosting platforms.
Modern software development is hardly possible without a version control system, because maintaining multiple folders and subfolders for different states of a project is very difficult and risks losing important code. A main reason for Git's popularity is its branching model. Branching allows you to create multiple versions of your work and track them in a structured manner. Each branch is like a parallel world: changes made in one branch do not affect the other branches until you merge them together.
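In this tutorial, we cover the following topics: creating a GitHub repository, committing code, pushing code, pulling code, creating a new branch, and merging branches.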
On GitHub, you can create a new repository by clicking the + sign in the upper-right corner of any page and selecting the New repository option from the drop-down menu.
We can also create a Git repository from the local command line. First, create a project folder:
mkdir NLPProject
Change into the project directory that you created in the last step.
cd NLPProject
Now, initialize the project folder with Git so that Git can manage its version control.
git init
The project folder now contains a hidden directory named .git, where Git keeps the repository data.
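The steps above can be tried safely in a throwaway location. The following is a minimal sketch, assuming Git is installed; the temporary directory and the variable names demo_dir and inside are illustrative and not part of the original walkthrough.

```shell
set -e
# Work in a temporary directory so nothing else on your machine is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

mkdir NLPProject
cd NLPProject
git init -q

# Git keeps its bookkeeping in the hidden .git directory.
ls -a

# Ask Git whether the current folder is now part of a repository.
inside="$(git rev-parse --is-inside-work-tree)"
echo "inside a work tree: $inside"
```

Deleting the .git directory would turn the folder back into an ordinary, untracked folder, so it should be left alone.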
After this, you can create and save the Python files of your project and add them to the local Git repository using the git add and git commit commands.
Let’s commit your code to the local repository. First, we add the files to the staging area using git add. After that, we check the status using git status, and finally we commit the code using git commit.
git add app.py
Let’s check the status using git status.
git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
Let’s commit the code using git commit. Here, we use the -m option to supply the commit message, a short string that explains what you are committing to the repo.
git commit -m "Adding application file"
Here, we made our first contribution to the project.
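The add, status, and commit steps can be sketched end to end as follows. This is an illustration under assumptions: it runs in a temporary repository, the contents of app.py are a placeholder, and the user name and email are dummy values set only for this demo repository.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, stored only in this demo repository's config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Placeholder contents; app.py is the file name used in this tutorial.
echo 'print("hello")' > app.py

git add app.py              # stage the file
git status --short          # a leading "A" marks a newly staged file
git commit -q -m "Adding application file"

# After the commit, the history has one entry and nothing is left to commit.
commit_count="$(git rev-list --count HEAD)"
last_message="$(git log -1 --pretty=%s)"
echo "$commit_count commit(s); latest: $last_message"
```

Running git status again at this point should report a clean working tree, since everything staged has been committed.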
We can push the code to the remote repository using the git push command.
git push [<options>] [<repository> [<refspec>...]]
Let’s push the local code to the remote master branch. Here, origin is the name of the remote.
git push origin master
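Pushing needs a remote to push to. The sketch below assumes a bare repository on the local disk as a stand-in for GitHub, so it can run without a network or an account; the paths, the placeholder identity, and the variable names are illustrative only. With a real GitHub repository, the URL given to git remote add would be the repository's URL instead.

```shell
set -e
work="$(mktemp -d)"
# A bare repository on disk plays the role of the GitHub remote here.
git init -q --bare "$work/remote.git"

# Create the local project with master as its initial branch.
git -c init.defaultBranch=master init -q "$work/NLPProject"
cd "$work/NLPProject"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("hello")' > app.py
git add app.py
git commit -q -m "Adding application file"

# Register the remote under the conventional name "origin", then push.
git remote add origin "$work/remote.git"
git push -q origin master

# Both sides should now point at the same commit.
local_head="$(git rev-parse master)"
remote_head="$(git --git-dir="$work/remote.git" rev-parse refs/heads/master)"
echo "local:  $local_head"
echo "remote: $remote_head"
```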
We can pull code from a remote branch into the local branch. The git pull command fetches the changes (commits) from a remote repository and merges them into the local repository.
git pull [<options>] [<repository> [<refspec>...]]
git pull origin master
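To see a pull bring in someone else's work, we need two working copies that share one remote. The following is a rough sketch under assumptions: a local bare repository stands in for GitHub, and the two collaborators, alice and bob, are hypothetical names used only for this illustration.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# First working copy: creates the initial commit and pushes it.
git -c init.defaultBranch=master init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice"
git config user.email "alice@example.com"
echo 'print("v1")' > app.py
git add app.py
git commit -q -m "Adding application file"
git remote add origin "$work/remote.git"
git push -q origin master

# Second working copy: cloned from the same remote.
git clone -q -b master "$work/remote.git" "$work/bob"

# The first collaborator pushes a second commit...
echo 'print("v2")' > app.py
git commit -q -am "Update application file"
git push -q origin master

# ...and the second collaborator pulls it into the local master branch.
cd "$work/bob"
git pull -q origin master
pulled="$(git log -1 --pretty=%s)"
echo "latest commit after pull: $pulled"
```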
We can create a new branch using the git branch command. The git branch command also offers options to list, rename, and delete branches.
git branch <new-branch>
git branch sentiment_analysis_demo
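The list, rename, and delete options mentioned above can be sketched as follows. This assumes a temporary repository with a single empty commit; the new name sentiment_demo is a hypothetical choice made only to show renaming.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
# A branch needs at least one commit to point at.
git commit -q --allow-empty -m "Initial commit"

git branch sentiment_analysis_demo                      # create
git branch                                              # list; * marks the current branch
git branch -m sentiment_analysis_demo sentiment_demo    # rename
git branch -d sentiment_demo                            # delete; -d refuses if work is unmerged

remaining="$(git branch --format='%(refname:short)')"
echo "remaining branches: $remaining"
```

Note that git branch only creates the branch; it does not switch to it.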
In Git, branches let us experiment with new features in parallel. We can switch between branches using the git checkout command. Let’s check out the sentiment_analysis_demo branch.
git checkout sentiment_analysis_demo
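The idea that each branch is its own parallel world can be checked with a small experiment. This is a minimal sketch in a temporary repository; the file sentiment.py and its contents are hypothetical and exist only to give the branch a commit of its own.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("hello")' > app.py
git add app.py
git commit -q -m "Adding application file"

git branch sentiment_analysis_demo
git checkout -q sentiment_analysis_demo
current="$(git rev-parse --abbrev-ref HEAD)"
echo "on branch: $current"

# A commit made here stays on this branch only.
echo 'print("sentiment")' > sentiment.py
git add sentiment.py
git commit -q -m "Add sentiment demo"

# Switching back to master removes the branch-only file from the working tree.
git checkout -q master
if [ -e sentiment.py ]; then on_master="present"; else on_master="absent"; fi
echo "sentiment.py on master: $on_master"
```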
Merging branches is a slightly more involved operation. Instead of merging my branch straight into the master branch, I first merge master into my branch (sentiment_analysis_demo), because if there are any conflicts I can resolve them in the branch itself and the master branch remains clean. Once that is done, check out the master branch and merge your new branch into it.
git checkout master
git merge sentiment_analysis_demo
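The full merge workflow described above, including the step of merging master into the feature branch first, might look like the sketch below. It assumes a temporary repository; sentiment.py and README.md are hypothetical files that give each branch one commit the other lacks. No conflict occurs in this example because the two branches touch different files.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("hello")' > app.py
git add app.py
git commit -q -m "Adding application file"

# Work on the feature branch.
git checkout -q -b sentiment_analysis_demo
echo 'print("sentiment")' > sentiment.py
git add sentiment.py
git commit -q -m "Add sentiment demo"

# Meanwhile, master moves on with an unrelated change.
git checkout -q master
echo '# NLPProject' > README.md
git add README.md
git commit -q -m "Add README"

# Step 1: bring master into the feature branch; any conflicts would surface here.
git checkout -q sentiment_analysis_demo
git merge -q --no-edit master

# Step 2: merge the now up-to-date feature branch into master.
git checkout -q master
git merge -q sentiment_analysis_demo

# master now contains the files from both lines of work.
git ls-files
```

Because the feature branch already contains everything on master after step 1, step 2 only has to move master forward to the same commit.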
In this tutorial, we have covered the concepts behind Git and the GitHub version control platform. We have focused on how to create a GitHub repo, commit code, push code, pull code, create a new branch, and merge branches. We have also discussed the Git Dos and Don’ts.