Project Simple Linear Regression
_Ujjwal_

Posted on May 4, 2025 | AIML

💼 Salary Prediction Using Linear Regression

This project uses a simple machine learning model to predict salaries based on years of experience. It's ideal for beginners looking to understand supervised learning and regression techniques.

Project Link

πŸ“ View the Salary Prediction Project on Google Colab


📊 Dataset Overview

File Name: Salary_dataset.csv
Columns:

  • YearsExperience: Number of years someone has worked.
  • Salary: The salary associated with that experience level (in dollars).

This dataset is clean and minimal, making it great for learning linear regression.


🧠 Project Type

  • Category: Supervised Machine Learning
  • Algorithm: Linear Regression
  • Problem Type: Regression (predicting a continuous value)

We're using the known input feature YearsExperience to predict a continuous output variable Salary.
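
In symbols, simple linear regression assumes a straight-line relationship between the two columns. The notation below (the intercept β0, the slope β1, and the error term ε) is introduced here for illustration and does not come from the project notebook:

```latex
\text{Salary} = \beta_0 + \beta_1 \cdot \text{YearsExperience} + \varepsilon
```

Ordinary least squares, which is what scikit-learn's LinearRegression fits, picks the intercept and slope that minimize the sum of squared errors over the n training rows:

```latex
\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} \left( \text{Salary}_i - \beta_0 - \beta_1 \cdot \text{YearsExperience}_i \right)^2
```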


🔧 Step-by-Step Procedure

1. 🛠️ Importing Required Libraries

We import essential Python libraries like:

  • pandas for data handling
  • numpy for numerical operations
  • matplotlib and seaborn for data visualization
  • sklearn for machine learning models and evaluation
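
A typical import cell for this list might look like the sketch below. The exact sklearn imports are an assumption, since the post only says sklearn is used for "models and evaluation":

```python
# Data handling and numerical operations
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Machine learning: splitting, the model, and evaluation metrics
# (this particular selection is an assumption, not taken from the notebook)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
```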

2. 📥 Loading the Dataset

We load the Salary_dataset.csv file into a pandas DataFrame using:

df = pd.read_csv('Salary_dataset.csv')
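
If you don't have the CSV at hand, you can try the same call on an in-memory string. This is only a sketch: the rows below are made up and are not the real contents of Salary_dataset.csv. The empty first header cell is there to show where a column like 'Unnamed: 0' comes from.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for Salary_dataset.csv (values are made up).
# The first header cell is empty, as happens when a DataFrame index was saved to CSV.
csv_text = """,YearsExperience,Salary
0,1.2,39000.0
1,3.0,56000.0
2,5.5,82000.0
3,8.0,101000.0
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.columns.tolist())
```

pandas gives a header cell with no name the label 'Unnamed: 0', which is the column dropped in the cleaning step.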

3. 🧼 Data Inspection and Cleaning

  • Use df.info() and df.describe() to explore data types and basic statistics.
  • Check for missing values using df.isnull().sum().
  • Drop unnecessary columns like 'Unnamed: 0' if present.
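
The three checks above can be sketched as follows. The small DataFrame is a made-up stand-in for the real dataset so the snippet runs on its own:

```python
import pandas as pd

# Hypothetical stand-in for the loaded dataset (values are made up)
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "YearsExperience": [1.2, 3.0, 5.5, 8.0],
    "Salary": [39000.0, 56000.0, 82000.0, 101000.0],
})

# Column names, dtypes, and non-null counts
df.info()

# Count, mean, std, min, quartiles, and max for each numeric column
print(df.describe())

# Number of missing values per column
missing = df.isnull().sum()
print(missing)

# Drop the leftover index column; errors="ignore" makes this a no-op if it is absent
df = df.drop(columns=["Unnamed: 0"], errors="ignore")
print(df.columns.tolist())
```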

4. 👀 Exploratory Data Analysis (EDA)

  • Use scatter plots to visualize the relationship between YearsExperience and Salary.
  • Use box plots to identify outliers in both features.
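
Here is one way to draw these plots with plain matplotlib, again on made-up data. seaborn's sns.scatterplot and sns.boxplot would work just as well. The output file name is a placeholder chosen for this sketch:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook such as Colab
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the cleaned dataset (values are made up)
df = pd.DataFrame({
    "YearsExperience": [1.2, 3.0, 5.5, 8.0, 10.1],
    "Salary": [39000.0, 56000.0, 82000.0, 101000.0, 122000.0],
})

fig, (ax_scatter, ax_box_x, ax_box_y) = plt.subplots(1, 3, figsize=(12, 4))

# Scatter plot: a roughly straight cloud of points suggests a linear fit is reasonable
ax_scatter.scatter(df["YearsExperience"], df["Salary"])
ax_scatter.set_xlabel("YearsExperience")
ax_scatter.set_ylabel("Salary")

# One box plot per column, on separate axes because the scales differ a lot
ax_box_x.boxplot(df["YearsExperience"].to_numpy())
ax_box_x.set_title("YearsExperience")
ax_box_y.boxplot(df["Salary"].to_numpy())
ax_box_y.set_title("Salary")

fig.tight_layout()
fig.savefig("salary_eda.png")  # placeholder file name
```

Points drawn beyond the whiskers of a box plot are the candidates for outliers.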

5. 📤 Feature Selection

Split the dataset into:

  • X: Independent variable (YearsExperience)
  • Y: Dependent variable (Salary)
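
A minimal sketch of this selection on made-up data. Note the double brackets for X: scikit-learn estimators expect the features as a 2-D table, even when there is only one column.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned dataset (values are made up)
df = pd.DataFrame({
    "YearsExperience": [1.2, 3.0, 5.5, 8.0],
    "Salary": [39000.0, 56000.0, 82000.0, 101000.0],
})

# Double brackets keep X as a DataFrame with shape (n_rows, 1)
X = df[["YearsExperience"]]

# Single brackets give a 1-D Series for the target
Y = df["Salary"]

print(X.shape, Y.shape)
```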

6. 🧪 Splitting the Dataset

Split the data into training and testing sets using:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=101)
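
To see what the split produces, here is a self-contained sketch on eight made-up rows. With test_size=0.25, a quarter of the rows go to the test set, and a fixed random_state makes the shuffle repeatable.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the cleaned dataset (values are made up)
df = pd.DataFrame({
    "YearsExperience": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "Salary": [40000.0, 45000.0, 52000.0, 58000.0, 66000.0, 71000.0, 79000.0, 85000.0],
})
X = df[["YearsExperience"]]
Y = df["Salary"]

# 25% of the rows are held out for testing; the rest are used for training
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=101)
print(x_train.shape, x_test.shape)

# Repeating the call with the same random_state selects the same test rows
_, x_test_again, _, _ = train_test_split(X, Y, test_size=0.25, random_state=101)
print(x_test.index.equals(x_test_again.index))
```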
