/*Course 1 Week 2 First SAS program*/
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
/* One DATA step: a second "DATA new; set mydata.gapminder;" would re-read the full data set and undo the KEEP */
DATA new; set mydata.gapminder;
keep incomeperperson Alcconsumption Femaleemployrate Lifeexpectancy polityscore urbanrate suicideper100TH co2emissions country;
LABEL incomeperperson ="Income per person" Alcconsumption ="Alcohol consum"
      Femaleemployrate ="Female employment" Lifeexpectancy ="Life expectancy at birth"
      polityscore ="Democracy score" urbanrate ="Urban population %"
      suicideper100TH ="Suicide/100000 age adjusted" co2emissions ="Total amount of co2 emissions";
RUN;
PROC FREQ DATA=new; TABLES incomeperperson Alcconsumption Femaleemployrate Lifeexpectancy polityscore urbanrate suicideper100TH;
RUN;
--> I show only one of the frequency tables because, for my research question, they don’t offer any interesting results. Variables like income, employment rate, and urban rate are continuous and take a different value for almost every country, so each value appears only once or twice in the table. Frequency tables are more useful for dummy variables, where they show the percentage of, e.g., positive or negative answers. There are no errors in the program and no missing data. There are 213 observations, as in the original CSV file (= 213 countries). From the gapminder database, I have kept only the variables I am interested in. The updated codebook is in the next post.
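
Since frequency tables say little about variables with this many distinct values, summary statistics may describe them better. The step below is only a sketch and not part of the submitted program: it assumes the data set `new` created above and uses PROC MEANS to print the count, the number of missing values, and the minimum, mean, median, and maximum of each numeric variable. The `NMISS` column is also a quick way to double-check the missing-data count per variable.

```sas
/* Sketch: summary statistics for the continuous variables.     */
/* Assumes the data set NEW from the DATA step in this program. */
PROC MEANS DATA=new N NMISS MIN MEAN MEDIAN MAX MAXDEC=2;
  VAR incomeperperson Alcconsumption Femaleemployrate Lifeexpectancy
      polityscore urbanrate suicideper100TH co2emissions;
RUN;
```

`country` is left out of the `VAR` statement because it is a character variable, and PROC MEANS only accepts numeric ones.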