Data Management and Visualization: Running my first program

The program I wrote this afternoon is posted below:

The outputs of my variables are as follows:

- The output of the "SMOKER" variable, which indicates Tobacco Use Status, with possible values: Current user=1; Ex-user=2; Lifetime nonsmoker=3;






- The output of the "S3AQ3B1" variable, which indicates Usual Frequency when Smoked Cigarettes, with possible values: Every day=1; 5 to 6 Day(s) a week=2; 3 to 4 Day(s) a week=3; 1 to 2 Day(s) a week=4; 2 to 3 Day(s) a month=5; Once a month or less=6; Unknown=9;










- I have added the percentage of missing values in the "S3AQ3C1" variable (Usual Quantity when Smoked Cigarettes), shown below:



- And the percentage of Unknown values in the "S3AQ3D1R" variable (Duration (Days) of Usual Cigarette Smoking):
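For readers who want to reproduce this kind of output, here is a minimal sketch of how a frequency table and the missing/Unknown percentages could be computed in plain Python. It is not the program from this post: the sample values are made up for illustration, `None` is assumed to stand for a missing (blank) answer, and the helper names `frequency_table` and `percent_matching` are hypothetical.

```python
from collections import Counter

# Made-up sample of coded answers, NOT taken from the real dataset.
# Assumption: None stands for a missing (blank) value; 9 is the "Unknown" code.
sample_freq = [1, 1, 2, 9, None, 1, 6, None, 3, 1]


def frequency_table(values):
    """Return {code: (count, percent)}; missing values are counted under None."""
    counts = Counter(values)
    total = len(values)
    return {code: (n, 100.0 * n / total) for code, n in counts.items()}


def percent_matching(values, codes):
    """Return the percentage of records whose value is one of the given codes."""
    hits = sum(1 for v in values if v in codes)
    return 100.0 * hits / len(values)


table = frequency_table(sample_freq)
pct_missing = percent_matching(sample_freq, {None})
pct_unknown = percent_matching(sample_freq, {9})

# Print each code with its count and share of all records.
for code, (count, pct) in table.items():
    print(code, count, f"{pct:.1f}%")
print(f"missing: {pct_missing:.1f}%, unknown: {pct_unknown:.1f}%")
```

Counting missing values as their own category keeps the percentages relative to the full dataset, which is what makes the share of missing answers visible in the first place.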




The last two variables are interesting because they leave only a very small share of the initial dataset with both necessary values filled in: the quantity of cigarettes smoked and the duration of usual cigarette smoking. The number of records that have neither an Unknown nor a Missing value for these two variables is very small, as shown below (only 2930 records out of the initial 43093, about 6.8%):
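The filtering idea described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this post: the records are made up, `None` is assumed to mean a missing value, the helper `is_complete` is hypothetical, and the Unknown codes in `unknown_codes` are placeholders that should be checked against the codebook.

```python
# Made-up records, NOT the real data; each dict holds the two variables.
# Assumption: None means the value is missing.
records = [
    {"S3AQ3C1": 20, "S3AQ3D1R": 3650},
    {"S3AQ3C1": None, "S3AQ3D1R": 120},
    {"S3AQ3C1": 99, "S3AQ3D1R": 400},
    {"S3AQ3C1": 5, "S3AQ3D1R": None},
    {"S3AQ3C1": 10, "S3AQ3D1R": 730},
]

columns = ["S3AQ3C1", "S3AQ3D1R"]

# Assumption: placeholder "Unknown" codes; verify them in the codebook.
unknown_codes = {"S3AQ3C1": {99}, "S3AQ3D1R": {99}}


def is_complete(record, columns, unknown_codes):
    """True if every listed column has a value that is neither missing nor Unknown."""
    for col in columns:
        value = record.get(col)
        if value is None or value in unknown_codes.get(col, set()):
            return False
    return True


complete = [r for r in records if is_complete(r, columns, unknown_codes)]
share = 100.0 * len(complete) / len(records)
print(f"{len(complete)} of {len(records)} records are complete ({share:.1f}%)")
```

Applied to the counts reported here, the same ratio is 2930 / 43093, or roughly 6.8% of the initial records.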


