# Expert Answer: MIS 690 GCU Model Deployment and Model Life Cycle

Write a 500-750 word paper describing the model deployment and model life cycle aspects of your model. It will include the following:

• What are model deployment costs? Be specific.
• What is a proposed task and timeline for deploying your model?
• What specific training will be required for those who will be using the model on a regular basis?
• Can this model be used on a repetitive basis? Explain.
• How will model quality be tracked over time?
• How will the model be re-calibrated and maintained over time?
• What specific benefits to the organization will be realized over time as a result of using the model?

Please keep in mind that the model we chose was the Logistic Regression model, so you would be talking about that model for this assignment. I also attached the previous papers, which should help you understand more of the modeling and the problem we are looking to solve. Please let me know if you have any other questions.

I also attached an EXAMPLE of what this assignment could look like or how it might be laid out. This example came directly from the instructor.
team_fire___model_validation_rough_draft_needs_external_validation.docx

week_7___example___model_deployment___life_cycle___j_debruyn__1_.docx

week_6___model_building.docx

Unformatted Attachment Preview

Model Building
Darren Mans
Renee Taillon
Shelbea Rainbolt
Thomas Salmons
Grand Canyon University: MIS 690
March 10, 2019
MODEL VALIDATION
Model Validation
Four predictive models (Logistic Regression, Neural Net, CHAID, and Discriminant) were built in IBM SPSS to predict burn patient status at the time of hospital discharge, subject to two requirements: (1) a minimum model accuracy of 70% and (2) no model overfitting. The datasets were initially segmented to investigate the concentration of burn patients under 5 years old. After the analysis, the decision was made during a stakeholder meeting that the project would move forward with only the entire, unsegmented dataset. The models that focused on the young age group were suspect and highly sensitive because the training dataset held very little historical data for one outcome of the dependent variable; e.g., only 5 records with the status of dead. In addition, the input variable TBSA already accounts, by its definition, for the discrepancy between young patients and older patients. TBSA is based on the Wallace Rule of Nines, which applies different criteria to adults and young children (Radiation Emergency Medical Management, n.d.). The final models under consideration included only variables with significance values of 0.05 or less and had accuracies ranging from 91% to 94%, all highly accurate. These accuracies met the 70% accuracy criterion in the business problem statement. Because all of the models were close in accuracy and seemingly robust, it was a challenge to choose just one. The Logistic Regression model was ultimately chosen because it was the most conservative with respect to false positives, an important consideration for a medical model that predicts a dead-or-alive status. We highlight the results of the internal validation by investigating the accuracy, sensitivity, and specificity, and we also discuss external validation.
Internal Model Validation
“Scholars have defined a series of methods through which to validate the results obtained from a logistic regression model” (Giancristoforo et al., 2007). In data science it is not acceptable to evaluate the performance of a model with the same data that is used to train the model because “it can easily generate over-optimistic and overfitted models” (Bulriss, 2018). Of the two options for internal model validation (hold-out or cross-validation), we chose the hold-out method during the model build in IBM SPSS because it is applicable to logistic regression models and is a validation feature built into the SPSS tool (IBM SPSS, 2019). The 1000-record dataset was partitioned into 70% for training, 20% for testing, and 10% for internal model validation. The SPSS model stream includes the Partition node following the Type node, with the parameters shown in Figure 1.
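As an illustration only, the 70/20/10 hold-out split described above can be sketched outside of SPSS. The function name `holdout_partition`, the fixed seed, and the use of exact partition sizes are assumptions made for this sketch; the SPSS Partition node may assign records differently.

```python
import random

def holdout_partition(n_records, fractions=(0.7, 0.2, 0.1), seed=42):
    """Randomly split record indices into training, testing, and validation sets."""
    indices = list(range(n_records))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_train = round(n_records * fractions[0])
    n_test = round(n_records * fractions[1])
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    # Whatever remains becomes the validation (hold-out) partition.
    validation = indices[n_train + n_test:]
    return train, test, validation

# 1000 records, as in the dataset described in the text
train_idx, test_idx, validation_idx = holdout_partition(1000)
print(len(train_idx), len(test_idx), len(validation_idx))
```

Because the three partitions never share a record, accuracy measured on the validation partition is not influenced by the data used to fit the model.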
Figure 1.
Figure 2 shows the Case Processing Summary, which indicates the number of cases included in our analysis. The eighth row tells us that zero records are missing data in the variables included in our analysis.
Figure 2. Case Processing Summary for Revised Model – Full Dataset
Figure 3 shows the accuracy for each of the partitioned datasets: 93% for training, 90.5% for testing, and 92% for validation. The testing accuracy is only slightly lower than the training accuracy, which is another good indicator that the model is not overfit. The validation accuracy of 92% meets the business problem criterion of a minimum of 70%. Evaluating the model on a separate validation dataset addresses the second criterion, avoiding an overfitted model, and is one of the main reasons for using a hold-out dataset.
Figure 3. Comparison between Predicted and Actual Dependent Variable, Status and Accuracy.
Figure 4 illustrates the Confusion Table, or Coincidence Matrix, for this revised model: Logistic Regression excluding the Race and Gender predictors.
Figure 4. Confusion Matrix – Full Dataset/Revised Model – Logistic Regression
The True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts are used to determine model sensitivity, specificity, and accuracy by the following equations:
$$\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{85}{85 + 7} = 0.924$$
$$\text{Specificity} = \frac{TN}{TN + FP} = \frac{7}{7 + 1} = 0.875$$
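The sensitivity and specificity calculations above can be checked with a few lines of Python. This is a minimal sketch using the counts given in the equations; the function name `confusion_metrics` is hypothetical, and the accuracy formula is the standard (TP + TN) / total, which the text names but does not write out.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of actual positives found
    specificity = tn / (tn + fp)                 # share of actual negatives found
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy

# Counts taken from the equations for the validation partition
sensitivity, specificity, accuracy = confusion_metrics(tp=85, fn=7, tn=7, fp=1)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} accuracy={accuracy:.3f}")
```

With these counts the accuracy comes out to 0.92, which agrees with the 92% validation accuracy reported for the hold-out partition.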
The validation dataset has a sensitivity of 92.4%, which is the proportion of actual positive cases correctly identified. The specificity of 87.5% is the proportion of actual negative cases correctly identified.
Figure 4. Model Fitting for Full Dataset Revised Model
The chi-squared statistic, which is based on the maximum likelihood estimates of the parameters, is 340.375. This is a high value, and since the p-value is less than our chosen significance level α = 0.05, we can reject the null hypothesis and conclude that there is an association between the independent variables and the patient’s status at hospital discharge. The “-2 Log Likelihood” is the “log-likelihood multiplied by -2 and is commonly used to explore how well a logistic regression model fits the data. The lower the value is, the better your model is at predicting your binary outcome variable” (Strand et al., n.d.). This is reflected in the fitted model’s value of 233.788, which is more than 50% lower than the value for a model with the intercept only (574.13). Figure 5 lists the pseudo R-square values; the Nagelkerke value, usually the more important of the pseudo-R indicators, is 0.69.
Figure 5. Pseudo R-Square
Figure 6, the model parameter estimates, lists another important statistical consideration, the Wald statistic (used to test statistical significance). The Wald statistics for TBSA and Age are very high, followed by Inhalation and Flame, which is roughly the order of predictor importance found in many of the other models, such as CHAID and Neural Net, as well.
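The fit statistics quoted above can be related to one another with a short calculation. The sketch below uses the standard Cox and Snell and Nagelkerke pseudo R-square formulas and the two -2 Log Likelihood values given in the text. The sample size of 700 is an assumption (70% of the 1000 records, taken to be the training partition); the source does not state which partition the statistics were computed on. The function names are hypothetical.

```python
import math

def deviance_reduction(null_deviance, model_deviance):
    """Fractional drop in -2 Log Likelihood relative to the intercept-only model."""
    return (null_deviance - model_deviance) / null_deviance

def nagelkerke_r2(null_deviance, model_deviance, n):
    """Nagelkerke pseudo R-square computed from two -2 Log Likelihood values."""
    lr_chi_square = null_deviance - model_deviance
    cox_snell = 1.0 - math.exp(-lr_chi_square / n)
    # Cox and Snell cannot reach 1, so Nagelkerke rescales by its maximum value.
    max_cox_snell = 1.0 - math.exp(-null_deviance / n)
    return cox_snell / max_cox_snell

NULL_DEVIANCE = 574.13    # -2LL, intercept-only model (from the text)
MODEL_DEVIANCE = 233.788  # -2LL, fitted model (from the text)
N_TRAINING = 700          # ASSUMPTION: 70% training partition of 1000 records

reduction = deviance_reduction(NULL_DEVIANCE, MODEL_DEVIANCE)
r2 = nagelkerke_r2(NULL_DEVIANCE, MODEL_DEVIANCE, N_TRAINING)
print(f"-2LL reduction: {reduction:.1%}, Nagelkerke R^2: {r2:.2f}")
```

Under that sample-size assumption the result rounds to the reported Nagelkerke value of 0.69. The difference between the two -2 Log Likelihood values (about 340.3) is also close to the reported chi-squared of 340.375, which is consistent with a likelihood-ratio test, although the source does not say so explicitly.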
Figure 6. Parameter Estimate List of Revised Model
Figures 7 and 8 are the cumulative Gains and Lift graphs for the model. “For a good model, the gains chart will rise steeply toward 100% and then level off” (IBM SPSS, 2019), which is shown in Figure 7 and validates our model as a good model. “For a good model, lift should start well above 1.0 on the left, remain on a high plateau as you move to the right, and then trail off sharply toward 1.0 on the right side of the chart. For a model that provides no information, the line will hover around 1.0 for the entire graph” (IBM SPSS, 2019). Figure 8 also reflects a good model choice.
Figure 7. Cumulative Gains Chart.
Figure 8. Lift Graph for Revised Model for Full Dataset.
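To make the shape of these charts concrete, the following sketch computes cumulative gains and cumulative lift from a list of outcomes and model scores. It is not the SPSS implementation; the function name `gains_and_lift`, the number of bins, and the toy data are all made up for illustration.

```python
def gains_and_lift(y_true, scores, n_bins=5):
    """Cumulative gain and cumulative lift per quantile, ranking by descending score."""
    ranked = [y for _, y in sorted(zip(scores, y_true), key=lambda pair: -pair[0])]
    n = len(ranked)
    total_hits = sum(ranked)
    rows = []
    for b in range(1, n_bins + 1):
        cutoff = round(n * b / n_bins)       # number of top-ranked records considered
        gain = sum(ranked[:cutoff]) / total_hits
        lift = gain / (cutoff / n)           # gain relative to random selection
        rows.append((b / n_bins, gain, lift))
    return rows

# Hypothetical data: 20 records, 5 positives concentrated at the high scores
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.50,
              0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05, 0.01]
toy_y = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

rows = gains_and_lift(toy_y, toy_scores)
for fraction, gain, lift in rows:
    print(f"top {fraction:.0%}: gain={gain:.0%} lift={lift:.2f}")
```

In this toy example the gain climbs quickly to 100% and the lift starts well above 1.0 before falling back to 1.0, which is the pattern the quoted guidance associates with a good model.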
Figure 9 shows the AUC (area under the curve) and the GINI score, which are used to score how well the model describes the data. The GINI score in the range of 0.923 – 0.095, for all the partitioned datasets, is good and is another confirmation of a good model.
Figure 9. AUC and GINI Score for Revised Model – Full Dataset
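For readers unfamiliar with these two scores, the sketch below shows one common way to compute them, assuming the usual relationship Gini = 2 × AUC − 1 for a binary classifier. The function names and the six-record toy dataset are hypothetical and are not taken from the paper's data.

```python
def auc_from_scores(y_true, scores):
    """AUC as the chance a random positive scores above a random negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def gini_from_auc(auc):
    """Gini score under the assumed relationship Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0

# Hypothetical scores for six records, three of them positive
toy_y = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = auc_from_scores(toy_y, toy_scores)
gini = gini_from_auc(auc)
print(f"AUC={auc:.3f} Gini={gini:.3f}")
```

Under this definition a model that ranks no better than chance has an AUC of 0.5 and a Gini of 0, while a perfect ranking gives 1.0 for both.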
Predictive Modeling for Logistic Regression
Conclusion
The dataset was partitioned for training, testing, and data validation resulting in accuracies
of 93%, 90.5%, and 92%, respectively. The Nagelkerke pseudo-R is 0.69. The Wald statistics
were reviewed along with the variable p-values. The null hypothesis is that the model is a ‘good
enough’ fit to the data and we will only reject this null hypothesis (i.e. decide it is a ‘poor’ fit) if
