Below is an example of a typical request from a PRC client. This task will provide us with a mechanism for evaluating your programming expertise, as well as acquainting you with the nature of our work.

You are given 3 existing SAS data sets: Patient Demographic, Prescription, and Medical.

Patient Demographic Variables:
patient_id [numeric]
sex {‘M’,‘F’,‘U’}
race {0,1,2,3,4,5}
birthdt [character (8) mmddyyyy format]

Prescription Variables:
patient_id [numeric]
fill_dt [character (6) yymmdd format]
pharmacyid [numeric]
drugcode [numeric]
pills [numeric]

Medical Variables:
patientid [numeric]
servicedate [character (6) yymmdd format]
providerid [numeric]
source [character (3) format ‘aaa’]
servicecode [character (5) format ‘annnn’]
diagnosiscode1 [character (3) format ‘nnn’]
diagnosiscode2 [character (3) format ‘nnn’]
diagnosiscode3 [character (3) format ‘nnn’]

1. There is one personal summary record per patient.
2. There are multiple prescription records per patient.
3. There are multiple medical records per patient.

Write SAS code, using DATA step(s) and procedures, that characterizes drug use and physician diagnoses for teenage patients (ages 13-18 years as of January 1, 2001).
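
As a rough illustration of the age rule above, the sketch below converts birthdt to a SAS date and keeps patients aged 13-18 on January 1, 2001. The data set names DEMOG and TEENS and the variables birth_date, as_of, and age are hypothetical, and treating unparseable birth dates as exclusions is an assumption rather than part of the request.

```sas
/* Sketch only: DEMOG, TEENS, birth_date, as_of and age are assumed names. */
data teens;
    set demog;

    /* birthdt is character mmddyyyy, e.g. '03151985';            */
    /* ?? suppresses log notes and _ERROR_ for invalid values     */
    birth_date = input(birthdt, ?? mmddyy8.);
    as_of      = '01JAN2001'd;

    /* Completed years between birth_date and as_of:              */
    /* count month boundaries, subtract one month if the birthday */
    /* day has not yet been reached, then convert to whole years  */
    if not missing(birth_date) then
        age = floor((intck('month', birth_date, as_of)
                     - (day(as_of) < day(birth_date))) / 12);

    format birth_date as_of date9.;

    /* Assumption: patients with a missing or invalid birthdt are */
    /* dropped, because a missing age fails this comparison       */
    if 13 <= age <= 18;
run;
```

The same INPUT approach with the YYMMDD6. informat would apply to fill_dt and servicedate. Those fields carry two-digit years, so the century they resolve to depends on the YEARCUTOFF= system option, which is worth stating as an assumption in the program comments.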

· Subset the medical and prescription records to those that occurred during the teen years.
· Subset the medical file to physician visit records (identified by the variable source=‘phy’).
· Create 10 new diagnosis group summary flag variables Dx_GP01-Dx_GP10 for each patient where:
Dx_GP01 = 1 if patient has any diagnosis code value from 000-100 inclusive, otherwise 0,
Dx_GP02 = 1 if patient has any diagnosis code value from 101-200 inclusive, otherwise 0,
...
Dx_GP10 = 1 if patient has any diagnosis code value from 901-999 inclusive, otherwise 0.
· Generate descriptive statistics for each of the patient demographic variables.
· Calculate and report the average age across all patients.
· Generate a report on the proportion of: patients on any drug, patients with any physician visit, and patients on any drug or with any physician visit.
· Report the distribution of: # of physician visits per patient, # of patients per drugcode, # of patients per diagnostic grouping, and # of prescriptions per drugcode.
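
One possible way to build the diagnosis group flags is sketched below. It assumes a hypothetical data set MEDICAL_PHY that has already been subset to source=‘phy’ and the teen years and sorted by patientid. It also assumes that the groups not spelled out in the list continue in steps of 100 (201-300, ..., 801-900); the request does not state those boundaries, so this should be confirmed before use.

```sas
/* Sketch only: MEDICAL_PHY and DX_FLAGS are assumed names.          */
/* Assumes MEDICAL_PHY is sorted by patientid.                        */
data dx_flags (keep=patientid Dx_GP01-Dx_GP10);
    set medical_phy;
    by patientid;

    array dxcode {3} $ diagnosiscode1-diagnosiscode3;
    array dxflag {10} Dx_GP01-Dx_GP10;
    retain Dx_GP01-Dx_GP10;

    /* Reset all ten flags at the first record of each patient */
    if first.patientid then do g = 1 to 10;
        dxflag{g} = 0;
    end;

    do i = 1 to 3;
        /* Codes are character 'nnn'; invalid values become missing */
        code = input(dxcode{i}, ?? 3.);
        if 0 <= code <= 999 then do;
            /* ASSUMPTION: groups run in steps of 100, so           */
            /* 000-100 -> 1, 101-200 -> 2, ..., 901-999 -> 10       */
            g = max(1, ceil(code / 100));
            dxflag{g} = 1;
        end;
    end;

    /* One output record per patient, carrying the retained flags */
    if last.patientid then output;
run;
```

Patients with no physician visit records do not appear in the output of this step, so a later merge with the patient-level file would be needed to give them flags of 0.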

Additional Info:
· Write the program in such a way that you would expect it to run on a real data set containing two years of prescription and medical data for 500,000 patients.
· Document as many assumptions as possible through comments within the program(s).