deepm09 committed
Commit 692d6a0 · 1 Parent(s): 716ce49

Upload resume.txt

Files changed (1): resume.txt +86 -0
@@ -54,3 +54,89 @@ Applied Intelligence for Medical Diagnosing June 2022

Link - https://www.igi-global.com/chapter/applied-intelligence-for-medical-diagnosing/288808

SQL COVID-19 Data Cleaning Project
GitHub link for SQL COVID-19 Data Cleaning Project - https://github.com/deepmehta922000/SQL_CovidData
Skills Used: Joins, CTEs, Temp Tables, Window Functions, Aggregate Functions, Creating Views, Converting Data Types
Overview
This SQL data-cleaning project explores Covid-19 data using joins, CTEs, temp tables, window functions, aggregate functions, views, and data-type conversions. It includes queries that compare total cases with total deaths and total cases with population, and that identify the countries with the highest infection rate relative to population. The analysis is also broken down by continent to show the continents with the highest death count per population, and temporary tables and views are used to store results for later visualization.
Covid-19 Data Exploration Project: Detailed Explanation
As the Covid-19 pandemic has affected our world in countless ways, I wanted to use my data analysis skills to gain insight into its impact on different countries and continents. To do so, I used a dataset covering Covid-19 cases, deaths, and vaccinations across the world.
The dataset includes fields such as location, date, total cases, new cases, total deaths, and population. I started by selecting the data for countries with a known continent and then explored the relationships between the variables.
One of the first things I looked at was the likelihood of dying from Covid-19 in a given country. I calculated a death percentage for each country by dividing the total number of deaths by the total number of cases, filtered the data to countries whose names contain "states", and sorted the results by country and date.
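The death-percentage query described above can be sketched as follows. This is an illustrative reconstruction run through Python's built-in sqlite3 so the logic is checkable; the table name (CovidDeaths), column names, and sample rows are my assumptions, not necessarily the repo's actual schema.

```python
import sqlite3

# Assumed schema for illustration only; the real project's table and
# column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CovidDeaths (
    location TEXT, date TEXT, total_cases REAL, total_deaths REAL)""")
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [("United States", "2021-01-01", 20000000, 350000),
     ("United States", "2021-02-01", 26000000, 440000)],
)

# Likelihood of dying after contracting Covid-19, restricted to countries
# whose names contain "states", ordered by country and date.
rows = conn.execute("""
    SELECT location, date,
           total_deaths * 100.0 / total_cases AS death_percentage
    FROM CovidDeaths
    WHERE location LIKE '%states%'
    ORDER BY location, date
""").fetchall()
for row in rows:
    print(row)
```

SQLite's LIKE is case-insensitive for ASCII, so '%states%' matches "United States" here just as the description implies.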
Next, I explored the percentage of the population infected with Covid-19 in different countries. I calculated this by dividing the total number of cases by the population, sorted the results by location and date, and then looked at the countries with the highest infection rates relative to their population.
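The infection-rate ranking might look like the sketch below, again via sqlite3 with an assumed schema and toy rows (the repo's actual column names could differ).

```python
import sqlite3

# Assumed table layout; population and total_cases are the columns the
# description above implies.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CovidDeaths (
    location TEXT, date TEXT, population REAL, total_cases REAL)""")
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [("Andorra", "2021-03-01", 77000, 11000),
     ("Andorra", "2021-04-01", 77000, 12000),
     ("India", "2021-04-01", 1380000000, 19000000)],
)

# Countries ranked by highest infection rate relative to population.
rows = conn.execute("""
    SELECT location, population,
           MAX(total_cases) AS highest_infection_count,
           MAX(total_cases) * 100.0 / population AS percent_population_infected
    FROM CovidDeaths
    GROUP BY location, population
    ORDER BY percent_population_infected DESC
""").fetchall()
print(rows[0])
```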
After that, I looked at countries with the highest death counts per population. I calculated the total death count for each country by summing the total deaths column, grouped the results by location, and sorted them by total death count in descending order.
To understand the pandemic's impact at the continent level, I wrote a query showing the continents with the highest death counts per population, grouping the data by continent and calculating the total death count for each one.
To get a sense of the global impact of Covid-19, I calculated the total number of cases and deaths worldwide by summing the new cases and new deaths columns across all locations and dates in the dataset.
Finally, I looked at the percentage of the population that had received at least one Covid-19 vaccine. For this analysis, I used data on the total number of vaccine doses administered and the population for each location and date, dividing the doses administered by the population and multiplying by 100. I then sorted the results by location and date to see how vaccination rates progressed over time.
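A sketch of that vaccination analysis is below. The rolling SUM() OVER window function is my assumption (it matches the "Window Functions" skill listed for the project), as are the two table names and the join keys; requires SQLite ≥ 3.25, which modern Python bundles.

```python
import sqlite3

# Assumed two-table layout: deaths/population in one table,
# new vaccinations in the other, joined on location and date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CovidDeaths (location TEXT, date TEXT, population REAL);
    CREATE TABLE CovidVaccinations (location TEXT, date TEXT, new_vaccinations REAL);
    INSERT INTO CovidDeaths VALUES
        ('Canada', '2021-01-01', 38000000),
        ('Canada', '2021-01-02', 38000000);
    INSERT INTO CovidVaccinations VALUES
        ('Canada', '2021-01-01', 10000),
        ('Canada', '2021-01-02', 15000);
""")

# Running total of doses per country, then expressed as a share of population.
rows = conn.execute("""
    SELECT d.location, d.date, d.population,
           SUM(v.new_vaccinations) OVER (
               PARTITION BY d.location ORDER BY d.date
           ) AS rolling_people_vaccinated
    FROM CovidDeaths d
    JOIN CovidVaccinations v
      ON d.location = v.location AND d.date = v.date
    ORDER BY d.location, d.date
""").fetchall()
for loc, date, pop, rolling in rows:
    print(loc, date, round(rolling * 100.0 / pop, 4))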
Throughout this project, I gained a deeper understanding of the impact of Covid-19 on different countries and continents. I was able to identify trends and patterns in the data and draw conclusions about the effectiveness of different strategies for managing the pandemic.
In conclusion, this project was an excellent opportunity for me to apply my data analysis skills to a real-world problem. By exploring the Covid-19 data, I gained a greater understanding of the pandemic's impact on the world and the strategies used to manage it. I hope that my analysis can contribute to the ongoing efforts to combat the pandemic and improve public health.



Power BI: Data Professionals Survey Dashboard
Skills Used: Power Query, DAX, ETL, Data Analysis, Visualization, Data Modeling, Excel, Dashboard Design
GitHub link for Power BI: Data Professionals Survey Dashboard
Overview
This Power BI project analyzes survey data from data professionals to generate insights on demographics, salary, work-life balance, and the challenges faced in the data industry. It provides interactive dashboards that filter the data by demographics and presents findings on country-wise salary distribution, the factors behind work-life balance satisfaction, and entry-level positions in the industry. Overall, the project offers valuable insights for individuals and organizations in the data profession.
Power BI: Data Professionals Survey Dashboard
This project takes the survey responses of data professionals as input and generates analyses across demographics such as country, happiness with salary, satisfaction with work-life balance, average salary, and how difficult it was to break into the profession.
Using Power BI's data modeling and visualization tools, the project presents the survey results in an easy-to-understand format, with interactive dashboards that let the user filter the data by demographics such as age, gender, and years of experience.
One of the main analyses is the country-wise distribution of data professionals and their corresponding salaries. The project reports the average salary of data professionals in different countries and compares it to the cost of living in those countries.
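The dashboard itself is built in Power BI, but the country-wise salary aggregation it visualizes can be sketched in pandas; the column names below are illustrative assumptions about the survey data, not the actual field names.

```python
import pandas as pd

# Toy stand-in for the survey responses; "avg_salary" here plays the role
# of a respondent's salary figure (e.g. the midpoint of a salary band).
survey = pd.DataFrame({
    "country": ["United States", "United States", "India", "United Kingdom"],
    "avg_salary": [105.0, 95.0, 40.0, 75.0],
})

# Average salary per country, highest first - the core of the
# country-wise salary view described above.
salary_by_country = (
    survey.groupby("country")["avg_salary"]
          .mean()
          .sort_values(ascending=False)
)
print(salary_by_country)
```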
The project also analyzes the factors that contribute to data professionals' satisfaction with their work-life balance, identifying the key drivers from the survey data and offering recommendations for companies looking to improve their employees' work-life balance.
Another important analysis concerns the difficulty of breaking into the profession. The project examines the challenges data professionals face when starting their careers and provides insight into the most common entry-level positions and job requirements in the data industry.
Overall, this Power BI project provides valuable insight into the data profession, helping organizations and individuals better understand the demographics, salaries, work-life balance, and challenges faced by data professionals.



Tableau: AirBnB Data Visualization
Skills Used: Excel, SQL, Data Cleaning, Data Visualization, Data Analysis, Geographic Mapping, Dashboard Design, Storytelling, Data Interpretation
Tableau link for this project: https://public.tableau.com/app/profile/deep.viral.mehta/viz/AirBnB_16789938821760/Dashboard1
Overview
This project involved analyzing Airbnb data for a specific region using Tableau. Key skills included data cleaning and preparation, data visualization, statistical data analysis, geographic mapping with Tableau's tools, dashboard design, storytelling, and data interpretation. The goal was to uncover trends, patterns, and relationships in the data and present them in a way that is easy for the audience to understand.
Tableau AirBnB
Data cleaning and preparation: taking raw Airbnb data and cleaning it so that it can be used effectively in Tableau, including joining data from multiple sources.
Data visualization: using Tableau to create visualizations that are easy to understand and help users gain insights from the data.
Data analysis: applying statistical techniques and other methods to the Airbnb data to uncover trends, patterns, and relationships.
Geographic mapping: Tableau offers a number of tools for creating geographic maps, which matter for showing the distribution of Airbnb listings across a region.
Dashboard design: a well-designed dashboard helps users quickly and easily understand the insights uncovered in the data, using colors, shapes, and other visual elements to draw attention to key information.
Storytelling: telling a story with the data by understanding the audience and crafting a narrative that helps them follow the insights I uncovered.
Data interpretation: interpreting the collected and analyzed data, which means understanding its context as well as any limitations or biases that may be present.



Social Buzz Initiative
Skills Used: Data Analysis, SQL, Big Data, Data Visualization tools (e.g., Tableau), Accenture Sandbox Database, Machine Learning, NLP, Git, Python (NumPy, Pandas), Data Preprocessing, Logistic Regression, Matplotlib, Seaborn
Embarking on a personal project about Social Buzz, a dynamic player in social media and content creation, I set out to leverage my skills and expertise to address their unique challenges. Founded in 2010 by visionary engineers, Social Buzz has rapidly grown to engage over 500 million active users monthly. Taking on their scaling process, my project encompassed a thorough audit of their big data practices, strategic recommendations for a successful IPO, and an in-depth analysis of their content categories, pinpointing the top 5 with the highest aggregate popularity. Through tasks ranging from creating a big data best-practices presentation to conducting on-site data center audits, I aimed to demonstrate my proficiency in guiding Social Buzz through this pivotal phase. As they sought external expertise for their IPO journey and data management, this 3-month engagement showcased my ability to navigate the complexities of big data and provide valuable insights for their continued success.
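The top-5 content-category analysis mentioned above boils down to a grouped aggregation; a minimal pandas sketch follows, with column names ("Category", "Score") that are my assumptions about the reaction data rather than the project's actual fields.

```python
import pandas as pd

# Toy stand-in for the merged content/reaction data.
reactions = pd.DataFrame({
    "Category": ["animals", "science", "animals", "food", "science", "travel"],
    "Score": [55, 70, 60, 40, 30, 25],
})

# Aggregate popularity per category, keep the five highest.
top5 = (
    reactions.groupby("Category")["Score"]
             .sum()
             .sort_values(ascending=False)
             .head(5)
)
print(top5)
```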


Logistic Regression Classification on the Indian Liver Patient Dataset
Skills Used: Python, Data Preprocessing and Cleaning, Logistic Regression, Model Evaluation and Selection, Plotting ROC and PR Curves, NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, Git
GitHub - https://github.com/deepmehta922000/Two_in_One_Diamond_ILPD
I used the Indian Liver Patient dataset, preprocessing it to handle null values and replacing categorical attributes using one-hot encoding. I then split the dataset 70:30 into training and test sets, trained a Logistic Regression classifier on the training set, and evaluated it on the test set with measures such as accuracy, error rate, TPR, FPR, TNR, FNR, sensitivity, specificity, precision, recall, and F-measure. I also plotted the ROC and PR curves and determined the optimal threshold for achieving a desired evaluation metric. Overall, this project was about building and evaluating a machine learning model for predicting liver disease in patients.
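The pipeline described above can be sketched with scikit-learn. Synthetic data stands in for the ILPD CSV here, and the column names are assumptions; the real project loads the actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic stand-in data; the real project reads the ILPD file and has
# more attributes than these illustrative ones.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(20, 80, n),
    "gender": rng.choice(["Male", "Female"], n),
    "total_bilirubin": rng.gamma(2.0, 1.0, n),
})
df["target"] = (df["total_bilirubin"] > 2.0).astype(int)

# One-hot encode the categorical attribute, then a 70:30 split.
X = pd.get_dummies(df.drop(columns="target"), columns=["gender"])
X_train, X_test, y_train, y_test = train_test_split(
    X, df["target"], test_size=0.3, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Accuracy plus the confusion-matrix-derived rates (TPR, FPR, ...).
acc = accuracy_score(y_test, y_pred)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
tpr, fpr = tp / (tp + fn), fp / (fp + tn)
print(f"accuracy={acc:.3f} TPR={tpr:.3f} FPR={fpr:.3f}")
```

The remaining measures (TNR, FNR, precision, F-measure) follow from the same four confusion-matrix counts.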


Predicting Diamond Prices Using Linear Regression
Skills Used: Python, Linear Regression, Feature Selection, Model Evaluation, Data Analysis, Data Visualization, Statistical Modeling, Machine Learning, Interpretation, Communication of Results, Git
GitHub - https://github.com/deepmehta922000/Two_in_One_Diamond_ILPD
In this project, I explored the Diamonds dataset by building a linear regression model to predict the price of a diamond from its carat size. I experimented with additional explanatory variables and applied transformations to improve the model, then predicted diamond prices on the test set and calculated the mean absolute error. One limitation of the model is that it cannot be expected to predict prices accurately 20 years into the future.
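A minimal sketch of the carat-to-price model: synthetic data stands in for the Diamonds dataset, and the log-log transformation is my assumption about the kind of transformation applied (it is a common choice for this dataset), not necessarily the project's exact one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic diamonds: price grows roughly as a power of carat, with noise.
rng = np.random.default_rng(1)
carat = rng.uniform(0.2, 3.0, 500)
price = 2000 * carat ** 1.7 * np.exp(rng.normal(0, 0.1, 500))

# Fit log(price) ~ log(carat): linear regression after a transformation.
X = np.log(carat).reshape(-1, 1)
y = np.log(price)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred_price = np.exp(model.predict(X_test))
mae = mean_absolute_error(np.exp(y_test), pred_price)
print(f"slope={model.coef_[0]:.2f}  MAE={mae:,.0f}")
```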


Indian Liver Patient Dataset: Exploratory Analysis and Classification
Skills Used: Python, NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, K-nearest Neighbors (K-NN), Decision Tree, Random Forest, Logistic Regression, Artificial Neural Networks (ANNs), One-hot Encoding, Cross-validation, Git
GitHub - https://github.com/deepmehta922000/Two_in_One_Diamond_ILPD
For the Indian Liver Patient dataset, I used exploratory analysis and preprocessing techniques to handle missing values, transform categorical variables with one-hot encoding, and analyze the correlations between attributes. Decision Tree and K-NN classifiers were then used to classify patients into those with and without liver disease. Cross-validation was used to train and evaluate the classifiers on the entire dataset, and other classifiers such as Random Forest, Logistic Regression, and ANNs were compared as well. The project helped me understand how to preprocess and analyze medical datasets and develop predictive classification models.
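The cross-validated comparison of classifiers can be sketched as below; make_classification stands in for the preprocessed ILPD features, and the ANN is omitted for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the one-hot-encoded, cleaned ILPD features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "K-NN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    # 5-fold cross-validation over the whole dataset, as described above.
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```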


Diamonds Dataset: Proximity Measures and Nearest Neighbors
Skills Used: Python, NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, K-nearest Neighbors (K-NN), Decision Tree, Random Forest, Logistic Regression, Artificial Neural Networks (ANNs), One-hot Encoding, Cross-validation, Git
GitHub - https://github.com/deepmehta922000/Two_in_One_Diamond_ILPD
For the Diamonds dataset, I designed and implemented a Python function to calculate a proximity measure between two data samples, then used it to find the k nearest neighbors of a given sample. The project helped me gain insight into the properties of the Diamonds dataset.
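A sketch of such a proximity function and the k-nearest-neighbor lookup built on it; Euclidean distance is my assumption for the measure (the project's actual measure may differ), and the toy rows merely mimic numeric diamond attributes.

```python
import numpy as np

def proximity(a, b):
    """Euclidean distance between two numeric samples (assumed measure)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.sqrt(np.sum((a - b) ** 2))

def k_nearest_neighbors(sample, data, k):
    """Return the indices of the k rows in `data` closest to `sample`."""
    distances = [proximity(sample, row) for row in data]
    return np.argsort(distances)[:k]

# Toy stand-in for diamond feature rows (e.g. carat, depth, table).
data = np.array([[0.30, 61.0, 55.0],
                 [0.31, 61.5, 55.0],
                 [1.50, 62.0, 57.0],
                 [2.00, 60.0, 58.0]])
print(k_nearest_neighbors([0.32, 61.2, 55.0], data, k=2))  # → [0 1]
```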


Outlier Detection
Skills Used: Jupyter Notebook, NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, Git
GitHub - https://github.com/deepmehta922000/OutlierDetection
I used a scatter plot to visualize a 2D dataset and found clusters in certain regions. I then applied the Local Outlier Factor algorithm to detect anomalies, first with default parameters and then with different settings; using 2 nearest neighbors identified more outliers. Overall, this project helped me understand the algorithm's performance under different settings and the importance of data visualization in identifying patterns and anomalies. I gained experience with the Local Outlier Factor algorithm, the impact of parameter tuning on its performance, and the strengths and limitations of unsupervised learning algorithms for anomaly detection.
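The experiment can be sketched with scikit-learn's LocalOutlierFactor on a synthetic 2D dataset with one planted anomaly; the data here is illustrative, not the project's dataset.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# One tight cluster plus an obvious far-away point.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(40, 2))
X = np.vstack([cluster, [[8.0, 8.0]]])

# Compare the default neighborhood size with n_neighbors=2, as described
# above; -1 in the returned labels marks a flagged outlier.
for n_neighbors in (20, 2):  # 20 is scikit-learn's default
    lof = LocalOutlierFactor(n_neighbors=n_neighbors)
    labels = lof.fit_predict(X)
    print(f"n_neighbors={n_neighbors}: {np.sum(labels == -1)} outliers flagged")
```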