Data Management and Visualization: Creating Graphs for my data
For the week 4 assignment of the "Data Management and Visualization" course, I am to create and analyze some graphs on the variables of the NESARC dataset.
First of all, I will write a brief reminder of the synthesized codebook I am using and the question(s) I have posed.
The codebook:
SMOKER - TOBACCO USE STATUS: Current user=1; Ex-user=2;
S3AQ3C1 - Usual Quantity when Smoked Cigarettes;
USFREQMO - Usual Frequency when Smoked Cigarettes (per month);
NUMCIGMO_EST - Estimated number of cigarettes smoked per month;
The variable S3AQ3D1R (Duration (Days) of Usual Cigarette Smoking) was dropped because of its very high percentage of unknown and missing data (78.59%).
The question I posed was "Have ex-cigarette smokers smoked, on average, less cigarettes than current ones?", where "less" can be understood in two ways: "less as a count" and "less as briefer duration periods". Having excluded the variable S3AQ3D1R, I can no longer address the second meaning, "less as briefer duration periods". So I am going to concentrate on the first meaning of less: "as a count".
I will start by creating univariate graphs for all the variables in my working dataset.
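As a rough illustration of what a univariate view involves, the sketch below counts the categories of SMOKER and summarizes one quantitative variable. It is only a minimal sketch: the rows are invented toy values (not NESARC data), the helper names are hypothetical, and the text bars stand in for the charts a plotting library would draw.

```python
from collections import Counter
from statistics import mean, median

# Toy rows standing in for the working dataset; these are NOT real NESARC values.
rows = [
    {"SMOKER": 1, "S3AQ3C1": 20, "USFREQMO": 30},
    {"SMOKER": 1, "S3AQ3C1": 10, "USFREQMO": 30},
    {"SMOKER": 2, "S3AQ3C1": 20, "USFREQMO": 30},
    {"SMOKER": 2, "S3AQ3C1": 5, "USFREQMO": 4},
    {"SMOKER": 1, "S3AQ3C1": 3, "USFREQMO": 14},
]


def category_counts(rows, var):
    """Count how many rows fall into each value of a categorical variable."""
    return Counter(row[var] for row in rows)


def text_bar_chart(counts, width=20):
    """Render counts as text bars, scaled so the largest bar is `width` characters."""
    largest = max(counts.values())
    lines = []
    for category in sorted(counts):
        bar = "#" * round(width * counts[category] / largest)
        lines.append(f"{category!s:>8} | {bar} {counts[category]}")
    return lines


def describe(rows, var):
    """Basic center and spread figures for a quantitative variable."""
    values = [row[var] for row in rows]
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    print("\n".join(text_bar_chart(category_counts(rows, "SMOKER"))))
    print(describe(rows, "S3AQ3C1"))
```

A categorical variable such as SMOKER is naturally shown as a count plot, while the quantitative variables are better served by a histogram plus the summary figures returned by `describe`.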
Note that I convert the categorical variable SMOKER to a categorical type and relabel its values as "Current" and "Ex-user" instead of 1 and 2.
From these charts we can conclude that (maybe surprisingly) ex-cigarette smokers did NOT, on average, smoke less than current ones. The charts show that ex-smokers have the higher mean in both the "Mean Estimated Number of Cigarettes Smoked per month" and the "Mean Usual Quantity when Smoked Cigarettes" variables. The last chart shows no difference between current and ex-smokers in the "Mean Usual Frequency when Smoked Cigarettes (per month)" variable.
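The relabeling of SMOKER and the per-group means behind such bar charts can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the rows are invented toy values that say nothing about NESARC, the helper names are hypothetical, and NUMCIGMO_EST is assumed here to be quantity times monthly frequency, which the post itself does not spell out.

```python
from statistics import mean

# Codes taken from the codebook above: 1 = current user, 2 = ex-user.
SMOKER_LABELS = {1: "Current", 2: "Ex-user"}

# Toy rows for illustration only; these are NOT real NESARC values.
rows = [
    {"SMOKER": 1, "S3AQ3C1": 10, "USFREQMO": 30},
    {"SMOKER": 1, "S3AQ3C1": 20, "USFREQMO": 30},
    {"SMOKER": 2, "S3AQ3C1": 20, "USFREQMO": 30},
    {"SMOKER": 2, "S3AQ3C1": 30, "USFREQMO": 30},
]


def add_monthly_estimate(rows):
    """Return copies of the rows with NUMCIGMO_EST added.

    Assumption: the estimate is usual quantity multiplied by monthly frequency.
    """
    return [
        {**row, "NUMCIGMO_EST": row["S3AQ3C1"] * row["USFREQMO"]}
        for row in rows
    ]


def mean_by_status(rows, var):
    """Mean of `var` for each smoking status, keyed by the readable label."""
    groups = {}
    for row in rows:
        label = SMOKER_LABELS[row["SMOKER"]]
        groups.setdefault(label, []).append(row[var])
    return {label: mean(values) for label, values in groups.items()}


if __name__ == "__main__":
    prepared = add_monthly_estimate(rows)
    for var in ("NUMCIGMO_EST", "S3AQ3C1", "USFREQMO"):
        print(var, mean_by_status(prepared, var))
```

Each dictionary returned by `mean_by_status` holds the bar heights for one chart, with one bar per smoking status.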