Question

Statistics

In this exercise, we will look at descriptive statistics and how to explore and summarize data sets. For this, we use the Heart Disease dataset from the UCI data repository. This dataset consists of 4 small datasets of people with heart disease admitted to 4 hospitals.

For now, we only work with the file. This data consists of 271 instances with 7 attributes. The attributes are described below:

Age: age in years

sex: 1 = male; 0 = female

cp: chest pain type

Value 1: typical angina

Value 2: atypical angina

Value 3: non-anginal pain

Value 4: asymptomatic

Trestbps: resting blood pressure

Chol: cholesterol level

Thalach: maximum heart rate achieved

heart_problem: 1 = have heart problem; 0 = no heart problem

Instruction: Use Microsoft Excel to do your work. Please submit your work as ONE MS Excel file and create one tab for each question. Show your work as rigorously as possible. Name the file as lastname_fastname_hw1.excel.

Using the attached data, answer the following questions:

1. How many patients have heart disease? (0.5)

2. What is the average cholesterol level of people with heart disease and without heart disease? What is the standard deviation? (1)

3. What are the median and average age of people with:

a. cholesterol higher than 240.0? (0.5)

b. cholesterol higher than 240.0 with heart disease? (0.5)

c. cholesterol higher than 240.0 without heart disease? (0.5)

4. Create a histogram of resting blood pressure. (1)

5. Create boxplots based on the sex of the patients for the following attributes:

a. cholesterol level (1.5)

b. maximum heart rate achieved (1.5)

6. For each box plot, answer the following questions:

a. What is the H-Spread (Q3-Q1) of cholesterol level for males and females? (0.5)

b. What are the Lower Hinge and Upper Hinge values for maximum heart rate for males and females? (0.5)

7. To find out whether two attributes are related and their values change together, we can use a scatter plot. Follow the instructions below and answer the questions:

a. Create two scatter plots of age and resting blood pressure, one for people with heart disease and one for people without heart disease. Is there any visual correlation? (1+1)

b. Calculate the average resting blood pressure of each age (HINT: Use Groupby for age) for people with heart disease. (1)

c. Calculate the average resting blood pressure of each age (HINT: Use Groupby for age) for people without heart disease. (1)

d. Now create two scatter plots using the previous results. Do you observe a correlation now? Do people without heart disease have higher blood pressure as they age than people with heart disease? (2)

8. Compare the resting blood pressure of people with heart disease and without. (1)
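Question 6 asks for the H-Spread (Q3-Q1) and the lower and upper hinges per sex. The assignment itself is to be done in Excel; the following is only a minimal Python sketch of the arithmetic, assuming Tukey's definition of hinges (the medians of the lower and upper halves of the sorted data). The function name `hinges` and the sample heart-rate values are made up for illustration and are not taken from the data file.

```python
from statistics import median

def hinges(values):
    """Return (lower_hinge, upper_hinge) using Tukey's definition:
    medians of the lower and upper halves of the sorted data, with the
    overall median included in both halves when the count is odd."""
    data = sorted(values)
    half = (len(data) + 1) // 2
    return median(data[:half]), median(data[-half:])

# Hypothetical Thalach values grouped by sex (NOT from the assignment data).
thalach_by_sex = {
    "male": [150, 160, 172, 140, 165],
    "female": [155, 148, 170, 162],
}

summary = {}
for sex, values in thalach_by_sex.items():
    lower, upper = hinges(values)
    # H-Spread is the distance between the two hinges.
    summary[sex] = {"lower": lower, "upper": upper, "h_spread": upper - lower}

print(summary)
```

Spreadsheet quartile functions interpolate between data points, so their Q1 and Q3 can differ slightly from the hinges computed this way, especially on small groups.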

LINK TO Data set

https://docs.google.com/document/d/1KYER8cMeWPcOlMJpegWNIDAF4maIAthKTM3Hrpr8rxk/edit?usp=sharing
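The "Groupby for age" hint in questions 7b and 7c amounts to collecting the resting blood pressure values that share the same age and averaging each group, separately for each heart_problem value. In Excel this is typically done with a PivotTable or AVERAGEIFS; the Python sketch below only illustrates the idea. The helper `mean_bp_by_age` and the sample rows are hypothetical and are not values from the data file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (age, trestbps, heart_problem). NOT from the assignment data.
rows = [
    (50, 130, 1),
    (50, 140, 1),
    (61, 150, 1),
    (50, 120, 0),
    (61, 138, 0),
    (61, 142, 0),
]

def mean_bp_by_age(rows, flag):
    """Average resting blood pressure per age for one heart_problem value."""
    groups = defaultdict(list)
    for age, bp, heart_problem in rows:
        if heart_problem == flag:
            groups[age].append(bp)
    # One (age, average) pair per distinct age, sorted by age for plotting.
    return {age: mean(bps) for age, bps in sorted(groups.items())}

with_disease = mean_bp_by_age(rows, 1)
without_disease = mean_bp_by_age(rows, 0)

print(with_disease)
print(without_disease)
```

Each resulting dictionary gives one point per age, which is what the scatter plots in question 7d are built from.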

Answer & Explanation Solved by verified expert
1. Count of patients having heart disease: 101

2. Average cholesterol level of people having heart disease: 269.1881188

Average cholesterol level of people not having heart disease: 239.9529412

Standard deviation of cholesterol level of people: 67.65771142

3. Median age and average age

Cholesterol > 240: median 49, average 48.35251799

Cholesterol > 240 having heart disease: median 50, average 49.41935484

Cholesterol > 240 with ...
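The quantities in questions 1 to 3 are conditional counts, means, standard deviations and medians. In Excel these map onto functions such as COUNTIF, AVERAGEIF, STDEV.S and MEDIAN; the Python sketch below shows the same calculations on a few made-up rows. The rows are not values from the data file, and the use of the sample standard deviation (rather than the population one) is an assumption, since the question does not say which is expected.

```python
from statistics import mean, median, stdev

# Hypothetical rows: (age, chol, heart_problem). NOT from the assignment data.
rows = [
    (45, 250, 1),
    (52, 230, 0),
    (60, 300, 1),
    (38, 210, 0),
    (55, 260, 0),
    (49, 245, 1),
]

# Question 1: number of patients with heart_problem == 1.
n_disease = sum(1 for _, _, hp in rows if hp == 1)

# Question 2: mean and sample standard deviation of cholesterol per group.
chol_by_group = {
    flag: [chol for _, chol, hp in rows if hp == flag] for flag in (0, 1)
}
chol_stats = {
    flag: (mean(vals), stdev(vals)) for flag, vals in chol_by_group.items()
}

def age_summary(selected):
    """Return (median age, mean age) for the selected rows."""
    ages = [age for age, _, _ in selected]
    return median(ages), mean(ages)

# Question 3: restrict to cholesterol > 240, then split by heart_problem.
high_chol = [r for r in rows if r[1] > 240]
high_all = age_summary(high_chol)
high_disease = age_summary([r for r in high_chol if r[2] == 1])
high_healthy = age_summary([r for r in high_chol if r[2] == 0])

print(n_disease, chol_stats, high_all, high_disease, high_healthy)
```

If the population standard deviation is wanted instead, `statistics.pstdev` (STDEV.P in Excel) is the counterpart of `stdev`.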