Introduction
A new drug was developed that was claimed to lower cholesterol levels in humans. A leading heart specialist wanted to know whether the claim made by the company selling the drug was accurate, and enlisted the help of 50 patients. The patients agreed to take part in an experiment in which 25 of them would be randomly allocated to a group taking the new drug, while the other 25 would take an identical-looking placebo (a sugar pill with no effect).
The statistician who was to analyse the data carried out the random allocation, so that neither the patients nor the doctor running the study knew who was taking the drug and who was taking the placebo.
All participants had their cholesterol levels measured before starting the course of pills and then, at the end of two months of taking the pills, they had their cholesterol level measured again. The data collected by the doctor are provided to you.
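The random allocation described above can be sketched in a few lines of Python. This is only an illustration: the patient IDs 1 to 50, the function name `allocate` and the fixed seed are assumptions made for the example, not details taken from the study.

```python
import random


def allocate(patient_ids, seed=None):
    """Randomly split the patients into two equal groups (drug, placebo)."""
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)              # every ordering is equally likely
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


patients = list(range(1, 51))          # hypothetical IDs for the 50 patients
drug_group, placebo_group = allocate(patients, seed=1)
print(len(drug_group), len(placebo_group))
```

Only the person holding the two lists (here, the statistician) knows which patient is in which group.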
Part A
1. Suggest one way the statistician could choose the sample of 50 patients, given that the specialist has access to about 850 patients who have cholesterol levels that the specialist considers to be high. Give reasons for your choice.
One way that the statistician could choose the sample of 50 patients is simple random sampling, in which every one of the 850 patients has the same chance of being selected. This is one of the most effective options, because a sample of 50 is too small to use methods such as stratified random sampling or cluster sampling effectively. Since the specialist has access to all 850 patients, a systematic sample is unnecessary. A random sample carries a very low risk of selection bias, and it is also a very easy sampling technique to use, minimising cost and effort for the researcher.
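As a minimal sketch of simple random sampling, the patients could be numbered 1 to 850 and 50 numbers drawn without replacement. The numbering scheme, the function name `choose_sample` and the seed are assumptions made for illustration only.

```python
import random

POPULATION_SIZE = 850   # patients available to the specialist
SAMPLE_SIZE = 50        # patients needed for the experiment


def choose_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of patient numbers without replacement."""
    rng = random.Random(seed)
    patient_numbers = range(1, population_size + 1)
    return sorted(rng.sample(patient_numbers, sample_size))


sample_ids = choose_sample(POPULATION_SIZE, SAMPLE_SIZE, seed=2024)
print(sample_ids[:5])   # first few selected patient numbers
```

Because `random.sample` draws without replacement, no patient can be selected twice.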
2. The experiment is said to be a “double blind” experiment. Undertake some research to investigate what this term means and relate your answer to this experiment.
This experiment is a double blind experiment. Participants are separated into two groups, one taking the real medicine and the other taking the placebo pills, but neither the participants nor the doctor collecting the measurements knows who is taking which pill. In this experiment only the statistician, who carried out the random allocation, knows which group each patient is in. This has the advantage of minimising bias, because neither the patients' expectations nor the doctor's can influence the results, so the effect of the medicine can be determined more accurately.
Part B
1. Produce a single ordered stem plot for the levels returned from all participants in the study.
2.
STEM   LEAF
 5     0, 2, 2, 3, 5, 5, 6, 7, 8, 8, 9
 6     0, 0, 1, 1, 1, 3, 3, 4, 4, 5, 5, 7, 8, 8, 9, 9, 9
 7     1, 1, 3, 5, 6, 6, 7, 8, 9
 8     0, 0, 0, 2, 3, 3, 4, 5, 6, 6, 8, 9
 9
10     7
Key: 5 | 0 = 5.0 units of cholesterol
3. Produce a back-to-back ordered stem plot for the data returned for participants who took the drug and those who took the placebo.
Key: 3 | 2 = 3.2 units of cholesterol
4. Comment on the shape of each distribution.
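Both groups are negatively skewed, meaning that the tail of each distribution extends towards the lower values and the mean lies below the median (5.032 against 5.1 for the drug group, and 7.172 against 7.2 for the placebo group).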
Part C
When the doctor was asked what information they wanted from the analysis of this data, they said: “As well as being interested in the centre and spread of the distributions, I am also interested in the values in between which the middle 50% of the cholesterol levels fall.”
You need to summarise the data using mean, median, range, interquartile range and standard deviation and place in the table below and then construct box plots for all 3 data sets on the same axis.
                      All before   Took the drug   Took the placebo
Mean                  6.978        5.032           7.172
Minimum               5            2.3             5.2
Lower Quartile        6            4.55            6.2
Median                6.85         5.1             7.2
Upper Quartile        8            5.65            7.8
Maximum               10.7         6.4             11.2
Range                 5.7          4.1             6
Interquartile Range   2            1.1             1.6
Standard Deviation    1.22         0.81            1.36
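The quartile-based rows of the "All before" column can be checked from the stem plot in Part B. The sketch below assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data; other quartile conventions can give slightly different values. Because the leaves are rounded to one decimal place, a mean calculated from them may differ slightly from the tabulated 6.978.

```python
from statistics import median

# Leaves copied from the ordered stem plot (stem = whole units, leaf = tenths).
STEM_PLOT = {
    5: [0, 2, 2, 3, 5, 5, 6, 7, 8, 8, 9],
    6: [0, 0, 1, 1, 1, 3, 3, 4, 4, 5, 5, 7, 8, 8, 9, 9, 9],
    7: [1, 1, 3, 5, 6, 6, 7, 8, 9],
    8: [0, 0, 0, 2, 3, 3, 4, 5, 6, 6, 8, 9],
    9: [],
    10: [7],
}

levels = sorted(
    (stem * 10 + leaf) / 10
    for stem, leaves in STEM_PLOT.items()
    for leaf in leaves
)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]          # values below the median position
    upper = data[(n + 1) // 2:]     # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]


minimum, q1, med, q3, maximum = five_number_summary(levels)
print("Range:", round(maximum - minimum, 2))
print("Interquartile range:", round(q3 - q1, 2))
print("Median:", round(med, 2))
```

The middle 50% of the cholesterol levels that the doctor asked about is the interval from the lower quartile to the upper quartile.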
Part D
The statistician was also approached by a surgeon who is to perform a procedure on a patient within the next eight weeks. One of the conditions for surgery is that the patient's cholesterol level must be between 3.0 and 4.5, but it is currently high, at greater than 8.0.
The surgeon requests an analysis of data to help prescribe the best medication option from 3 available medications.
The data available for each of the cholesterol-reducing medications are given in terms of a. The value of a will be supplied by your teacher and is different for each student, with 0.80 < a < 1.30.
The value of a is 0.92.
Medicine 1
Week No.            1          2         3          4         5         6
Cholesterol Level   10.23a     8.38a     6.86a      5.62a     4.6a      3.76a
                    =9.4116    =7.7096   =6.3112    =5.1704   =4.232    =3.4592
Medicine 2
Week No.            1          2         3          4         5         6
Cholesterol Level   9.07a      6.5a      4.66a      3.34a     2.39a     1.71a
                    =8.3444    =5.98     =4.2872    =3.0728   =2.1988   =1.5732
Medicine 3
Week No.            1          2         3          4         5         6
Cholesterol Level   11.01a     12.3a     11.01a     7.89a     4.52a     2.08a
                    =10.1292   =11.316   =10.1292   =7.2588   =4.1584   =1.9136
• Evaluate the table of values with the value of 'a' supplied by your teacher.
Medicine 1
Week No.            1         2        3         4        5        6
Cholesterol Level   9.4116    7.7096   6.3112    5.1704   4.232    3.4592
Medicine 2
Week No.            1         2        3         4        5        6
Cholesterol Level   8.3444    5.98     4.2872    3.0728   2.1988   1.5732
Medicine 3
Week No.            1         2        3         4        5        6
Cholesterol Level   10.1292   11.316   10.1292   7.2588   4.1584   1.9136
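The evaluation is a single multiplication of each coefficient by a. A short sketch that reproduces the values is shown here; the dictionary layout and the function name `evaluate` are choices made for the example.

```python
A = 0.92  # value of a supplied for this student

# Coefficients of a for weeks 1 to 6, as given in the task.
COEFFICIENTS = {
    "Medicine 1": [10.23, 8.38, 6.86, 5.62, 4.60, 3.76],
    "Medicine 2": [9.07, 6.50, 4.66, 3.34, 2.39, 1.71],
    "Medicine 3": [11.01, 12.30, 11.01, 7.89, 4.52, 2.08],
}


def evaluate(coefficients, a):
    """Multiply every coefficient by a, rounding to 4 decimal places."""
    return {
        name: [round(c * a, 4) for c in values]
        for name, values in coefficients.items()
    }


cholesterol = evaluate(COEFFICIENTS, A)
for name, values in cholesterol.items():
    print(name, values)
```

A different value of a only rescales every level by the same factor, so the shape of each curve stays the same.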
• Draw best fit graphs for each medicine
• Find possible mathematical models to fit the data for each medicine.
The curve of best fit for cholesterol medicine 1 had an r² value of 1 (to the precision reported) when an exponential model was used. The week-to-week ratios are all close to 0.82, so the exponential equation matches the results almost exactly.
The curve of best fit for cholesterol medicine 2 again had an r² value of 1 (to the precision reported) when an exponential model was used. Here the week-to-week ratios are all close to 0.72, so the equation again matches the results almost exactly.
The curve of best fit for cholesterol medicine 3 had an r² value of exactly one when a quintic model was used. This is expected, because a quintic can pass exactly through any six data points. However, this equation was very complex, so a much simpler cubic function was used instead. The r² value for the cubic function was 0.9998, which is an almost perfect fit, so this very minor loss in accuracy was deemed an acceptable tradeoff for a much simpler equation.
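One way to obtain an exponential model of the form y = A × b^x is to fit a straight line to ln(y) by least squares. The sketch below does this for medicines 1 and 2. It is an illustration rather than the method used to produce the graphs, and the r² it returns is computed on the logarithmic scale, so it may differ slightly from the value reported by a graphing tool.

```python
import math

WEEKS = [1, 2, 3, 4, 5, 6]
MEDICINE_1 = [9.4116, 7.7096, 6.3112, 5.1704, 4.232, 3.4592]
MEDICINE_2 = [8.3444, 5.98, 4.2872, 3.0728, 2.1988, 1.5732]


def fit_exponential(xs, ys):
    """Fit y = A * b**x by least squares on ln(y).

    Returns (A, b, r_squared), where r_squared is measured on the log scale.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((v - mean_log) ** 2 for v in logs)
    sxy = sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return math.exp(intercept), math.exp(slope), r_squared


for name, data in (("Medicine 1", MEDICINE_1), ("Medicine 2", MEDICINE_2)):
    start, factor, r2 = fit_exponential(WEEKS, data)
    print(f"{name}: y = {start:.3f} * {factor:.4f}^x, r^2 = {r2:.6f}")
```

The fitted factor b is the proportion of the cholesterol level remaining after each week, so a smaller b means a faster reduction.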
• Discuss the options to help advise the surgeon of the medicine to prescribe the patient.
All three medicines brought the patient's cholesterol within the acceptable range of 3.0 to 4.5 cholesterol units at some point during the six-week trial period; however, each medicine did so at a different rate. Cholesterol medicine 3 actually increased the patient's cholesterol level in week 2, which may be dangerous and could lead to health complications such as heart disease or stroke, so this medicine will be rejected.
This leaves medicines 1 and 2, both of which follow a similar pattern of steady exponential reduction. However, medicine 2 reduces cholesterol faster than medicine 1, and so would most likely be preferable as it means that surgery can be performed earlier. Overall, based on this data, medicine 2 is the recommended medicine because it is the fastest acting drug and entails no potentially dangerous increases in cholesterol levels.
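To support the discussion, the weeks in which each medicine places the patient inside the required range of 3.0 to 4.5 can be read directly from the evaluated tables. The sketch below does that check; the function name `weeks_in_range` is chosen for the example.

```python
TARGET_LOW, TARGET_HIGH = 3.0, 4.5   # range required before surgery

# Evaluated cholesterol levels for weeks 1 to 6 (a = 0.92).
LEVELS = {
    "Medicine 1": [9.4116, 7.7096, 6.3112, 5.1704, 4.232, 3.4592],
    "Medicine 2": [8.3444, 5.98, 4.2872, 3.0728, 2.1988, 1.5732],
    "Medicine 3": [10.1292, 11.316, 10.1292, 7.2588, 4.1584, 1.9136],
}


def weeks_in_range(levels, low, high):
    """Return the week numbers whose level lies between low and high."""
    return [
        week
        for week, level in enumerate(levels, start=1)
        if low <= level <= high
    ]


for name, levels in LEVELS.items():
    print(name, weeks_in_range(levels, TARGET_LOW, TARGET_HIGH))
# Medicine 1 [5, 6]
# Medicine 2 [3, 4]
# Medicine 3 [5]
```

On these figures medicine 2 reaches the range earliest, at week 3, but falls below 3.0 from week 5 onwards, so the timing of the surgery relative to the start of the medication matters.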
• Write any assumptions and limitations of the models created from the data.
It is assumed that none of the medicines has any dangerous side effects that may interfere with the surgery, such as decreasing blood pressure or causing inflammation. It is also assumed that the medicine is only needed in order to complete the surgery and is not intended for long-term use, in which case it may lower cholesterol to a dangerously low level. It is also assumed that the medicines are roughly the same price, because if one is significantly more expensive then it may not be used, in spite of its superiority.
A limitation of this data is that it only covers a six-week period, so the models cannot be relied on after week six. If the surgeon was planning to perform the surgery in the seventh or eighth week, they would not have any valid data unless they prescribed the medicine later. This data also does not include any information about the side effects of the medication, or potentially different effects for people of different sex, age or health.