4450 - Unit 3: Self-Check Assignment 3: Diabetes Forecasting


Question. 4450 - Unit 3: Self-Check Assignment 3: Diabetes Forecasting

This assignment builds on all of our previous work and introduces you to predictive analytics through a forecasting method called a binary classifier. We will then work on how to visualize and understand a binary classifier.

In this assignment, you will:
- Receive an introduction to binary classifiers, logistic regression, and the results, including true-positive, false-positive, true-negative, and false-negative results
- Run a binary classification algorithm on our diabetes data
- Visualize the results in Tableau

For this assignment, follow these steps:
1. Download the diabetes dataset if you need it
2. Learn about binary classifiers
3. Perform binary classification using a logistic regression in Python (this has been written for you; all you need to do is press ‘run’ in Colab)
4. Download the results
5. Visualize the results in Tableau

Attachments:
- Diabetes_Classifier.ipynb
- Diabetes.csv dataset

Download the Diabetes Dataset

If you need to download the dataset again, click on the following link: Pima Indians Diabetes Database (We just used this dataset in a previous assignment, so you very well may already have it handy.)

Learn About Binary Classifiers

The word “binary” in this context means “just two options.” Some common binary outcomes could be whether a consumer will respond to direct marketing outreach (binary outcomes: they buy or they don’t buy), whether a streaming subscriber will like a certain movie (binary outcomes: they give it thumbs-up or thumbs-down), or whether an attempted financial transaction is legitimate (binary outcomes: it’s legitimate, or it’s a fraud). The important part of a binary outcome is that there are exactly two options.

A classifier is an algorithm that takes one or more input variables and, as its output, makes a prediction about the value of a different variable. The prediction values are constrained to be on a pre-selected list.
A binary classifier, then, is an algorithm that takes as its input one or more variables and, as its output, classifies the results into one of two mutually exclusive categories:

| Problem Domain | Possible Input Variables (can have lots) | Binary Output Variable (2 values only) |
|---|---|---|
| Direct marketing | Age, income, gender of the consumer | Consumer buys or does not buy |
| Streaming subscriptions | Other movies they like, age of streamer, subscription price | Thumbs-up or thumbs-down for this movie |
| Financial transactions | Dollar amount of transaction, country of origin, frequency of transaction, whether or not the person has bought from this vendor before | Transaction is marked as legitimate, or transaction is flagged as fraudulent |

Question 1: Understanding the Problem

In the diabetes dataset, what is/are the possible input variable(s)? (Input variables are the things we will use to make our prediction.) Select all that apply.
- Glucose
- Insulin
- BMI
- Age
- Blood pressure
- Outcome

Question 2: Understanding the Problem

In the diabetes dataset, what is/are the possible output variable(s)? (An output variable is the thing we want to predict.) Select all that apply.
- Glucose
- Insulin
- BMI
- Age
- Blood pressure
- Outcome

There are many algorithms that can be used in data science for classification. Exactly how to determine which algorithm should be used, and how to evaluate its results, is beyond the scope of this course. But we will give you a very basic overview of how predictive analytics models work here. In the learning resources for this unit, we have provided a video from StatQuest about logistic regression. The video's example of predicting obesity in mice is very close to what we are doing here.

Question 3: What We Are Trying to Do Here with Logistic Regression

Which statement most closely resembles what we are trying to do here with our logistic regression binary classifier?
- We want to predict whether or not a person will have diabetes (our binary outcome). We want to use some combination of glucose, insulin, BMI, and other data, and we realize that the relationship might not be linear. If you double the BMI, you might not double the chances of having diabetes.
- We want to predict whether or not a person will have diabetes (our binary outcome). We want to use some combination of glucose, insulin, BMI, and other data, and we expect that the relationship will be linear for all variables. In other words, if you double glucose, you will double the diabetes. If you double insulin, you will double the diabetes. And if you double glucose and insulin, you will have four times the diabetes.
- We want to predict the BMI of a person based on their diabetes status. We want to use the logistic regression S-curve to determine what the 25th, 50th, 75th, and 99th percentiles of BMI for diabetic and non-diabetic people in this sample are.
- We want to predict the S-curve-shaped interrelationships between BMI, age, glucose, pregnancies, and other data. We want to be able to see, as age goes up, what happens to BMI, glucose, and pregnancies with a valid regression with a solid P-value.
- We want to predict the log odds of having diabetes because mathematically, this will solve the problem that a straight-line linear relationship will often exceed 100%, especially when some numbers are outliers (like age of 80+ years or BMI at age 50+).

With binary classifiers, we typically build the model on our training data and then test the model (to see how good the predictions actually were) on the testing data. We then collect the results of our testing in a confusion matrix. You will find a learning resource about confusion matrices from StatQuest.
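To make the ideas of log odds and confusion-matrix counting more concrete, here is a minimal Python sketch. It is not the code in Diabetes_Classifier.ipynb. The coefficients, the function names, and the test pairs are all made up for illustration, and a real model would learn its coefficients from the training data.

```python
import math
from collections import Counter


def sigmoid(log_odds):
    """Map log odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients chosen for illustration; NOT fitted to the diabetes data.
INTERCEPT = -8.0
GLUCOSE_COEF = 0.05
BMI_COEF = 0.06


def predicted_probability(glucose, bmi):
    """The log odds are a straight line in the inputs; the probability is an S-curve."""
    log_odds = INTERCEPT + GLUCOSE_COEF * glucose + BMI_COEF * bmi
    return sigmoid(log_odds)


def model_result(actual_diabetes, predicted_diabetes):
    """Name one confusion-matrix cell from an (actual, predicted) pair of booleans."""
    correctness = "True" if actual_diabetes == predicted_diabetes else "False"
    polarity = "Positive" if predicted_diabetes else "Negative"
    return f"{correctness} {polarity}"


def confusion_counts(pairs):
    """Tally how many (actual, predicted) pairs land in each confusion-matrix cell."""
    return Counter(model_result(actual, predicted) for actual, predicted in pairs)


if __name__ == "__main__":
    low = predicted_probability(glucose=100, bmi=25)
    high = predicted_probability(glucose=200, bmi=25)
    print(f"glucose 100: {low:.2f}, glucose 200: {high:.2f}")

    # Made-up (actual, predicted) pairs standing in for a testing dataset.
    made_up_test_pairs = [(True, True), (True, False), (False, False),
                          (False, True), (False, False)]
    print(confusion_counts(made_up_test_pairs))
```

With these invented coefficients, doubling glucose does not double the predicted probability, and the probability never passes 100%. That is the behavior the S-curve is meant to provide.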
Question 4: Our Diabetes Model Confusion Matrix

Let’s say we want to predict whether a person has diabetes, and we are using the following confusion matrix:

| | Person actually has diabetes | Person actually does not have diabetes |
|---|---|---|
| Person is predicted to have diabetes | A | B |
| Person is predicted to not have diabetes | C | D |

Match the cell with its label:
- True positive, or TP
- False positive, or FP
- False negative, or FN
- True negative, or TN

Question 5: Practicing Our TP/TN/FP/FN Terminology

Let’s say we have a person with a glucose of 136, insulin of 130, and BMI of 28.3, and they are 42 years old. Our logistic regression model predicts that this person will not have diabetes. However, their medical records indicate that they do indeed have diabetes. Which phrase should be used to describe this situation?
A True positive
B False positive
C False negative
D True negative

Perform Binary Classification Using Logistic Regression in Python

Now we are going to run a binary classification predictive analytics algorithm in Python and review the results. You won’t have to write any code, but you will be running code which has been written for you.

Go to your browser and set up a new instance of Google Colab at Welcome to Colaboratory. Upload two files:
- Upload the “Diabetes_Classifier.ipynb” as a notebook
- Upload the “diabetes.csv” as a file uploaded to session storage

[Image: Google Colab]

Run the first cell, the classifier model.
You can ask ChatGPT to explain this to you more fully, but basically what we are doing here with this code is:
- Importing a bunch of other code written by other people to help us build the model
- Reading in the diabetes.csv dataset
- Splitting the data into a training dataset (which we will use to build our logistic regression prediction model) and a testing dataset (which we will use to tell how good our model really was)
- Running the model on our training data
- Evaluating the model on our testing data

When the code in this cell has finished running, it gives a little confusion matrix. (Note that this confusion matrix has its labels switched from the way StatQuest did them. If you are keeping close track of these things, you will notice that the matrix printed from this code has the actual values on the left and the predicted values on the top. If you are not keeping close track of these things, you don’t need to keep close track of this switch either.)

[Image: StatQuest]

Run the next cell to generate the output file we will use to visualize the results in Tableau. Your output should look something like this, and you should have a "diabetes_predicted.csv" file available for download. It may take a minute or two to run and another minute or two to refresh, and you can click the "refresh" icon if you want to see the output file the very minute it is available:

[Image: Classifier]

Let’s just look at the "diabetes_predicted.csv" file before we download it:

[Image: csv file]

Here, let’s look at the first row, Patient_ID 767. This person has a glucose of 126, BMI of 30.1, and an age of 47. This person also had an actual outcome of Diabetes (fourth column) but was predicted to have Not Diabetes (fifth column). The Model Results column classified this as a False Negative for this person (sixth column).

Question 6: Interpreting the Output File

Look further through the diabetes_predicted.csv file. For Patient_ID 526, what was their outcome?
A True positive
B False positive
C False negative
D True negative

Download the diabetes_predicted.csv file to your computer. We are now ready to visualize it using Tableau.

Visualize the Results in Tableau

We can see that these sorts of output files can be difficult to interpret. Let’s use Tableau to help visualize them.

Fire up Tableau and import your diabetes_predicted.csv data file to Tableau. Be sure the file you import has both Actual Outcome Text and Predicted Outcome Text fields in it. Check: You should have 231 total rows in this data source.

First, let’s make a basic bar graph: How many model results were true positives? False positives? Other values? Drag the Model Results to the Columns bar and the diabetes_predicted.csv (Count) to the Rows. It should look a little bit like the skeleton below, but you should have bar charts here.

[Image: csv file]

Question 7: Interpreting the Output File

How did the model do? Of the 231 people in this dataset, what was the most frequent model result?
A True positive: 49% of the results were true positive
B False positive: 18 people had a false-positive result
C False negative: 32% of the results were a false negative
D True negative: 132 people had a true-negative result

Let’s take another look at these results, this time in a layout more akin to the confusion matrix we saw earlier.

Go to another worksheet. Put the Actual Outcome Text in the Rows area, and the Predicted Outcome Text in the Columns area:

[Image: outcome]

Then drag the diabetes_predicted.csv (Count) to the area with the “Abc” in it:

[Image: csv file]

You will now have the numbers of the actual and predicted outcomes summed up for you:

[Image: predicted outcomes]

Let’s make the Marks a bit fancier: also drag the diabetes_predicted.csv (Count) to the Size, and once again drag diabetes_predicted.csv (Count) to the Label. Take the Model Results to the Label and expand your graphics so you can see the whole thing.
You will get something that should look like this:

[Image: predicted csv]

Question 8: Interpreting the Visual Confusion Matrix

Look at your visual matrix. Which statements would you agree with? Select all that apply.
A If a person actually has diabetes, their results would be found on the top row.
B If a person actually does not have diabetes, their results would be found on the bottom row.
C If the model predicts diabetes, the majority of the people in this category will turn out to have diabetes.
D If the model predicts not diabetes, the majority of the people in this category will not turn out to have diabetes.
E If a person has diabetes, the model is not great at predicting this; there will be a lot of incorrect predictions given.
F If a person does not have diabetes, the model is not great at predicting this; there will be a lot of incorrect predictions given.

Sometimes we want to see how a model’s predictions vary as certain variables change. Does this model predict differently for people of different ages?

Go to a new worksheet and make a histogram of the age. Set the bin size to 10. It should look like this:

[Image: bar graph]

Add the Predicted Outcome Text in front of the Age (bin). You will now see histograms, but they are split by predictions:

[Image: bar graph]

Question 9: Interpreting the Split Histograms

Look at these two histograms. Which statements would you agree with? Select all that apply.
A Among those who are predicted not to have diabetes, the age distribution has a lot of younger people in it.
B In the age group 40–49, the model is predicting approximately the same number of people with and without diabetes.
C In the age group 40–49, the model is predicting approximately the same percentage of people with and without diabetes.
D In the group which is predicted to have diabetes, the ages are relatively evenly distributed between people in their 20s, 30s, 40s, and 50s, with a sharp drop-off at age 60 and older.
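The split histogram amounts to counting rows per age bin, separately for each predicted outcome. A rough Python sketch of that counting follows. The function names and the sample rows are invented for illustration, and labeling each bin by its lower edge is an assumption about how the worksheet displays bins.

```python
from collections import Counter


def age_bin(age, bin_size=10):
    """Return the lower edge of the bin containing age, e.g. 47 -> 40 for size 10."""
    return (age // bin_size) * bin_size


def split_histogram(rows, bin_size=10):
    """Count rows per (predicted outcome, age bin) pair."""
    return Counter((predicted, age_bin(age, bin_size)) for predicted, age in rows)


if __name__ == "__main__":
    # Made-up (predicted outcome, age) pairs, NOT taken from diabetes_predicted.csv.
    made_up_rows = [
        ("Not Diabetes", 23),
        ("Not Diabetes", 27),
        ("Not Diabetes", 41),
        ("Diabetes", 47),
        ("Diabetes", 52),
    ]
    for (predicted, lower_edge), count in sorted(split_histogram(made_up_rows).items()):
        print(predicted, lower_edge, count)
```

These are raw head counts, so two bars of equal height mean equal numbers of people, not equal percentages of their groups.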
Sometimes the total head count does not give the whole picture, and a percentage is a better way to go. Let’s try to get our histograms to show us percentages of total.

Duplicate your paired Age histograms to a new sheet. Under the Rows, CNT(Age), pull down the right arrow and Add Table Calculation.

[Image: histogram]

For your Table Calculation, choose Percent of Total, and have it compute using Table (down):

[Image: table]

Then put the Model Results on the Color so you can see what percentage of each age group has what sorts of model results:

[Image: graph]

The final touch: often, culturally, we see green as “good/correct” and red as “bad/error.” Let’s go through and set the colors so the “true” outcomes are in the green family and the “false” outcomes are in the red family.

[Image: graph]

Now we can look at, for example, a person in their 20s who is predicted not to have diabetes. Do they need to worry? The prediction is not diabetes, so we want the graph on the right (blue and red). Find the bar which represents people in their 20s who are not predicted to have diabetes.

[Image: graph]

Let’s look at this bar a little more closely. We can drag the diabetes_predicted.csv (Count) onto the labels to have it show us the total number of people here. We can see that the model does pretty well (lots of true model outcomes) for people in their 20s who are predicted not to have diabetes.

[Image: graph]

Question 10: Interpreting the Stacked Percentage Bar Charts

Look at these charts. Which statements are accurate? Select all that apply.
A For people in their 40s (age 40–49), a model prediction of “no diabetes” is very good news because the model is nearly always correct, and they probably don’t have diabetes.
B For very elderly people (age 80–89), there is only one person in the dataset of this age. Because the model predicts “diabetes” for this person, it will always predict “diabetes” for all people in this age group, regardless of their BMI, glucose, or other variables.
C Say you have 10 people in their 20s who receive a model prediction of “diabetes.” Approximately 7 of those people will actually have diabetes, but 3 will be incorrectly predicted to have diabetes.
D Say you have 10 people in their 20s who receive a model prediction of “diabetes.” Approximately 4 of those people will actually have diabetes, and these are the false positives.
E There are relatively few people in either category (predicted diabetes, predicted no diabetes) who are age 60–69, so we should be cautious about interpreting these percentages for a broader population.
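The stacked percentage bars can be thought of as each model result's share of its own bar. Here is a minimal Python sketch of that arithmetic. It assumes the table calculation normalizes within each (predicted outcome, age bin) bar, and the function name and rows are made up rather than read from diabetes_predicted.csv.

```python
from collections import Counter


def percent_within_bar(rows):
    """Return each model result's percent share of its (predicted outcome, age bin) bar."""
    bar_totals = Counter()
    segment_counts = Counter()
    for predicted, age_bin_start, result in rows:
        bar_totals[(predicted, age_bin_start)] += 1
        segment_counts[(predicted, age_bin_start, result)] += 1
    return {
        key: 100.0 * count / bar_totals[key[:2]]
        for key, count in segment_counts.items()
    }


if __name__ == "__main__":
    # Made-up rows: (predicted outcome, age bin lower edge, model result).
    made_up_rows = (
        [("Diabetes", 20, "True Positive")] * 3
        + [("Diabetes", 20, "False Positive")] * 1
        + [("Not Diabetes", 20, "True Negative")] * 2
    )
    for key, share in sorted(percent_within_bar(made_up_rows).items()):
        print(key, f"{share:.0f}%")
```

Because every bar is rescaled to 100%, the percentages hide how many people stand behind each bar, which is why adding the count to the labels is useful.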


