Friday, March 28, 2008

SAS Clinical Trials

Clinical Trials Terminology for SAS Programmers
Clinical Trial Online – Running SAS on the Web without SAS/IntrNet
Managing Clinical Trials Data using SAS® Software
Clinical Trials
Quality Control and Quality Assurance in Clinical Research: SAS
Everything we should know about ICH, GCP and their Guidelines

Data Integrity through DEFINE.PDF and DEFINE.XML
SAS® and the CDISC (Clinical Data Interchange Standards Consortium)
An Introduction to CDISC:
CDISC: Why SAS® Programmers Need to Know
CDISC Implementation Step by Step: A Real World Example
CDISC standards
Supporting the CDISC standards
How to test CDISC Operational Data Model (ODM) in SAS
Can Coding MedDRA and WHO Drug be as Easy as a Google Search?
The Use of CDISC Standards in SAS from Data Capture to Reporting
Clinical Data Model and FDA/CDISC Submissions
Implementing an Audit Trail within a Clinical Reporting Tool

Annotation of CRFs:
Trial eCRF Pages
Using SAS to Speed up Annotating Case Report Forms in PDF Format
ANNOTATED CASE REPORT FORM AUTOMATION SYSTEM
Annotated CRF 1: Download
(CTN0008_SDTM_annotation_20070413.pdf - 2179Kb)
Annotated CRF 2: Download
(CTN001_SDTM_ANNOTATION_20070330.pdf - 564Kb)
Annotated CRF 3: Download
(CTN002_SDTM_ANNOTATION_20070403.pdf - 560Kb)

Study Protocol 1: Download
(NIDA-CTN-0001_Bup_Nx_vs_Clonidine_Inpatient_Protocol_v.5b_112700.pdf - 192Kb)
SDTM-annotated CRFs
The CDISC ODM Study Designer :User Manual
Creating Case Report Tabulations (CRTs) for an NDA Electronic Submission to the FDA
XML Basics for SAS Programmers
A SAS MACRO FOR PRODUCING CLINICAL LABORATORY SHIFT TABLE
Some Statistical Programming Considerations for e-Submission

Saturday, January 26, 2008

SAS Programming

Base and Advanced SAS Programming:
Base SAS Certification Exam Model Questions:
SAS® Certification: An End User’s Review
How SAS thinks
SAS Procedures Guide
SAS Programming Skills
Affordable SAS tips
Advanced SAS Programming Techniques
Creative Uses of SAS Functions
Data Summarization Methods in Base SAS Procedures
The SAS Debugging Primer
SAS Model resumes and SAS Tips and Tricks
SAS UNIX commands:
SAS statements,Procedures and Functions
Introduction to merging in SAS
How MERGE Really Works
FIRST.variable and LAST.variables:
Proc Report and Proc Tabulate

Downloads

SAS eBOOKs
Base and Advanced SAS Certification Materials and Practice Exams

SAS Video Tutorials:
Class Notes
Entering Data, view movie
Exploring Data, view movie
Modifying Data, view movie
Managing Data, view movie
Analyzing Data, view movie (part 1) and movie (part 2)

Data step:

getting started 1: windows SAS code
getting started 2: data step SAS code
automatic _N_ variable SAS code
drop & delete SAS code
formatting: dates and numbers SAS code date sal.txt (also see the format procedure below to create your own formats)
functions SAS code
import: Bringing in data from Excel SAS code Excel import file Excel export file text file
input: length statement SAS code infile options.txt
long SAS code long.txt
missing data SAS code
output option SAS code
pointers SAS code ex7.txt ex8.txt ex9.txt
more about pointers SAS code pointers.SAS ex10.txt
missover & delimiter SAS code delimiter.txt
more on the delimiter SAS code
retain SAS code
set SAS code
simulations: random numbers SAS code
sum SAS code
statistical functions SAS code


Logic:
do loops SAS code
more about do loops SAS code
nested do loops SAS code
if then statements SAS code score.txt

Combining Data sets:

concatenating and interleaving SAS code
one-to-one merging SAS code
match merging SAS code
updating SAS code

Character functions:
substring function SAS code
trim and left functions SAS code
compress and index functions SAS code record.txt
indexc and indexw functions SAS code
implicit character-to-numeric conversion SAS code
explicit character-to-numeric conversion SAS code
implicit and explicit numeric-to-character conversion SAS code

Arrays:
introduction to arrays SAS code
using arrays to count SAS code
using arrays to order observations SAS code
using arrays to transpose data SAS code ratsdose.txt
two dimensional arrays SAS code temp.txt fin.txt

Permanent SAS Data sets: (great for large data sets)
introduction: using libname SAS code
put and file statements SAS code survey.dat data1.dat data2.dat
data3.txt fruit.dat data4.dat data5.dat income.dat

Procedures:

ANOVA SAS code incommed.dat
analysis of equal vars: B-P for anova SAS code
contents: Great for large data sets SAS code sheep.dat
correlation SAS code
import: Bringing in data from Excel SAS code Excel import file Excel export file text file
format SAS code incommed.dat (also see formatting above for SAS' date and number formats)
frequency tables SAS code incommed.dat freq.xls
means SAS code incommed.dat
more about means SAS code incommed.dat
gcharts: Bar and Pie charts SAS code incommed.dat
gplot: a prettier plot SAS code
more about gplot SAS code
plot SAS code
print SAS code account.txt
sort SAS code account.txt
more about sorting SAS code compt.txt
t-test SAS code incommed.dat
transpose SAS code
univariate SAS code test.dat

Programming outside the Data step or Procedures:

Getting started 3: options SAS code
more options SAS code

Macros:

introduction: macro variables (%let statement) SAS code
number.dat contest.dat
%put statement SAS code score.dat
basic macros SAS code
macros with parameters SAS code ranks.dat
macro do loops SAS code
macro if/then/else statements SAS code makeup.dat
nested macros SAS code
simulations example SAS code reg.dat

Online Study Materials:

Fundamentals of Using SAS (part I)

Introduction to SAS
Descriptive information and statistics
An overview of statistical tests in SAS
Exploring data with graphics

Fundamentals of Using SAS (part II)

Using where with SAS procedures
Missing values in SAS
Common SAS options
Overview of SAS syntax of SAS procedures
Common error messages in SAS

Reading Raw Data into SAS

Inputting raw data into SAS
Reading dates into SAS and using date variables

Basic Data Management in SAS

Creating and recoding variables
Using SAS functions for making/recoding variables
Subsetting variables and observations
Labeling data, variables, and values
Using PROC SORT and the BY statement
Making and using permanent SAS data files (version 8)

Data Management:

How do I make unique anonymous ID variables for my data?
How can I create an enumeration variable by groups?
How can I see the number of missing values and patterns of missing values in my data file?
How can I count the number of missing values for a character variable?
How can I increment dates in SAS?
How can I find things in a character variable in SAS?
How do I standardize variables (make them have a mean of 0 and sd of 1)?
Is there a quick way to create dummy variables?

Reading/Writing Data Files

How do I read a file that uses commas, tabs or spaces as delimiters to separate variables?
How do I read a delimited file with missing values?
How do I read a delimited file that has delimiters embedded in the data?
What are some common infile options for reading a raw data file?
How do I read raw data files compressed with gzip (.gz files) in SAS?
How do I write a data file that uses commas, tabs or spaces as delimiters between variables?
How do I read/write Excel files in SAS version 8?

Reading/Writing SAS Files with Formats

How do I use a SAS data file with a format library?
How do I use a SAS data file when I don't have its format library?

Other:

How can I change the way variables are displayed in proc freq?
How can I put a value from a data file to a macro variable?
How can I create tables using proc tabulate?

Procedures

PROC MEANS More than just your average procedure(PDF) by Peter R. Welbrock

The power of PROC FORMAT(PDF) by Jonas V. Bilenas
Ten Things You Should Know About PROC FORMAT(PDF) by Jack Shoemaker
PROC SQL for DATA Step Die-Hards(PDF) by Christianna S. Williams
An Introduction to the SQL Procedure(PDF) by Chris Yindra
Alternatives to Merging SAS Data Sets … But Be Careful(PDF) by Michael J. Wieczkowski
Handling Missing Values in the SQL Procedure(PDF) by Danbo Yi & MA Lei Zhang
Creating and using indexes in SAS
Creating and using formats and format libraries in SAS
Using multidimensional arrays

Good Programming Practices

Bulletproofing Your SAS Results(PDF) by Vanessa Hayden
Clean-up, Comments and Code - Making it Maintainable(PDF) by Clay and Lori Martin
SAS Program Efficiency for Beginners(PDF) by Bruce Gilsen
Coding for Posterity(PDF) by Rick Aster

Output Delivery System (ODS)

ODS, YES! Odious, NO! – An Introduction to the SAS Output Delivery System(PDF) by Lara Bryant, Sally Muller & Ray Pass
ODS for Data Analysis: Output As-You-Like-It in Version 7(PDF) by Christopher R. Olinger and Randall D. Tobias, from SUGI Proceedings, 1998, courtesy of SAS.
Making the SAS Output Delivery System (ODS) work for you(PDF) by William Fehlner, from SUGI Proceedings, 1999, courtesy of SAS.
Twisty Little Passages All Alike, Output Delivery System (ODS) Templates Exposed(PDF) by Chris Olinger, from SUGI Proceedings, 1999, courtesy of SAS.

SAS Macros

Getting Started with Macros(PDF) by Ian Whitlock
Moving from Macro Variables to Macros(PDF) by Lisa Sanbonmatsu
Macros from Beginning to Mend A Simple and Practical Approach to the SAS Macro Facility(PDF) by Michael G. Sadof
An Introduction to Macro Variables and Macro Programs(PDF) by Mike S. Zdeb
Creating Macro Variables via PROC SQL(PDF) by Mike S. Zdeb
More About “INTO:Host-Variable” in PROC SQL: Examples(PDF) by John Q. Zhang
Macro Quoting Functions, Other Special Character Masking Tools, and How To Use Them(PDF) by Arthur L. Carpenter
Secrets of Macro Quoting Functions – How and Why(PDF) by Susan O’Connor
&&&, ;;, and Other Hieroglyphics Advanced Macro Topics(PDF) by Chris Yindra, C. Y

PROC SQL:

An Introduction to Proc Sql
Top Ten Reasons to Use PROC SQL

Other

Debugging 101(PDF) by Peter Knapp

Those Missing Values in Questionnaires(PDF) by John R. Gerlach & Cindy Garra
Avoiding Mayhem in the New Millennium: Working with Missing Data(PDF) by JoAnn Matthews
Simplifying Complex Character Comparisons by Using the IN Operator and the Colon (:) Operator Modifier(PDF) by Paul Grant

Arrays: In and Out and All About(PDF) by Marge Scerbo
Complex Arrays Made Simple(PDF) by Mary McDonald, PaineWebber Incorporated
You Could Look It Up: An Introduction to SASHELP Dictionary Views(PDF) by Michael Davis
The 'SKIP' Statement(PDF) by Paul Grant
Indexing and Compressing SAS Data Sets: How, Why, and Why Not(PDF) Andrew H. Karp, courtesy of NESUG

SAS Online Documentation:

SAS 9 Documentation
Version 9.1.3
SAS OnlineDoc 9.1.3 for the Web
Proc Glimmix
Proc Quantreg
Version 9.1.2
SAS OnlineDoc 9.1.2 for the Web
Version 9.1
SAS OnlineDoc 9.1 for the Web
SAS OnlineDoc 9.1 in PDF

Clinical trials:

Clinical Trials Terminology for SAS Programmers
Clinical Trial Online – Running SAS on the Web without SAS/IntrNet
Managing Clinical Trials Data using SAS® Software
Clinical Trials
Trial eCRF Pages
Everything we should know about ICH, GCP and their Guidelines
SAS® and the CDISC (Clinical Data Interchange Standards Consortium)

SUGI Papers:

SUGI Proceedings

for SUGI 22, 23, 24, 25, 26, 27
SGF 2008 SGF 2007 SUGI 31 SUGI 30 SUGI 29 SUGI 28 SUGI 27 SUGI 26 SUGI 25 SUGI 24 SUGI 23 SUGI 22 SUGI 21 SUGI 20 SUGI 13 SUGI 12 SUGI 09

SAS Forums/Groups:

Tek-Tip SAS Forum
Indeed
DBASpot
SAS Programming Forum
ITIL Community Forum

SAS Interview Questions: Base SAS

· What SAS statements would you code to read an external raw data file to a DATA step?

INFILE statement.

· How do you read in the variables that you need?

Using the INPUT statement with column pointers such as @5, or with column ranges such as 12-17.

· Are you familiar with special input delimiters? How are they used?

DLM= and DSD are the delimiter options that I've used. They are coded in the INFILE statement. Comma-separated values (CSV) files are a common type of file that can be read with the DSD option. The DSD option treats two delimiters in a row as a missing value, and it ignores delimiters enclosed in quotation marks.
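As a minimal sketch of the DSD behaviour described above (the data set name `csv_demo`, the variable names and the sample rows are made up for illustration), the following step reads in-stream comma-separated data that contains a quoted comma and an empty field:

```sas
/* Hypothetical example: read comma-delimited in-stream data with DSD. */
data csv_demo;
  infile datalines dsd dlm=',' truncover;
  /* The colon modifier reads up to the next delimiter using the informat. */
  input name :$20. city :$20. amount;
  datalines;
Smith,"Raleigh, NC",100
Jones,,250
;
run;

proc print data=csv_demo;
run;
```

With DSD in effect, the comma inside the quoted city should stay part of the value, and the two consecutive commas in the second row should leave `city` missing rather than shifting `amount` into it.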

· If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?

By using the MISSOVER option in the INFILE statement. If some data lines are shorter than others, use the TRUNCOVER option in the INFILE statement instead.

· What is the difference between an informat and a format? Name three informats or formats.

Informats read the data; formats write the data.
Informats: COMMAw., DOLLARw., DATEw., MMDDYYw., TIMEw., PERCENTw.
Formats: WORDDATE18., WEEKDATEw.
Many names, such as DATEw. and MMDDYYw., exist both as an informat and as a format.
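A small sketch of the distinction, assuming nothing beyond Base SAS: an informat turns text into a stored value, and formats only change how that stored value is written. The variable name `d` is arbitrary.

```sas
data _null_;
  /* Informat: read the text 03/28/2008 into a numeric SAS date value. */
  d = input('03/28/2008', mmddyy10.);

  /* Formats: write the same stored value in different styles. */
  put d=;               /* raw number of days since 01JAN1960 */
  put d= date9.;        /* e.g. ddMONyyyy style */
  put d= worddate18.;   /* month name written out */
run;
```

The stored value never changes between the three PUT statements; only its displayed form does.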

· Name and describe three SAS functions that you have used, if any?

LENGTH: returns the length of an argument, not counting trailing blanks (missing values have a length of 1).
Ex: a='my cat'; x=LENGTH(a);
Result: x=6

SUBSTR: SUBSTR(arg,position,n) extracts a substring from an argument starting at 'position' for 'n' characters, or until the end if 'n' is omitted.
Ex: a='(916)734-6241'; x=SUBSTR(a,2,3);
Result: x='916'

TRIM: removes trailing blanks from a character expression.
Ex: a='my '; b='cat'; x=TRIM(a)||b;
Result: x='mycat'

SUM: returns the sum of the nonmissing values.
Ex: x=SUM(3,5,1);
Result: x=9

INT: returns the integer portion of the argument.

· How would you code the criteria to restrict the output to be produced?

Use NOPRINT option.

· What is the purpose of the trailing @ and the @@? How would you use them?

Trailing @: holds the current input record for the next INPUT statement within the same iteration of the DATA step. By using @ without specifying a column, it is as if you are telling SAS, "Stay tuned for more information. Don't touch that dial." SAS holds the line of data until it reaches either the end of the DATA step iteration or an INPUT statement that does not end with a trailing @.
Double trailing @@: holds the record across iterations of the DATA step, until the end of the record is reached. When you have multiple observations per line of raw data, use @@ at the end of the INPUT statement. The line-hold specifier is like a stop sign telling SAS, "Stop, hold that line of raw data."
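The two line-hold specifiers can be sketched side by side. The data set names (`scores`, `typed`), variables and data lines below are invented for illustration only.

```sas
/* Double trailing @@: several observations on each raw data line. */
data scores;
  input name $ score @@;
  datalines;
Ann 90 Bob 85 Cal 77
Dee 68 Eve 99
;
run;

/* Single trailing @: read a record type first, hold the line,
   then decide how to read the rest of the same line. */
data typed;
  input type $1. @;
  if type = 'A' then input x 3-5;
  else input y $ 3-5;
  datalines;
A 123
B abc
;
run;
```

Under these assumptions `scores` should end up with five observations from two raw lines, while `typed` gets one observation per line, read with a layout chosen by the first column.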

· Under what circumstances would you code a SELECT construct instead of IF statements?

When you have a long series of mutually exclusive conditions and the comparison is numeric, using a SELECT group is slightly more efficient than using IF-THEN or IF-THEN-ELSE statements because CPU time is reduced.
SELECT group:
SELECT: begins a SELECT group.
WHEN: identifies SAS statements that are executed when a particular condition is true.
OTHERWISE (optional): specifies a statement to be executed if no WHEN condition is met.
END: ends a SELECT group.
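The four parts of a SELECT group might be put together as in this sketch. The data set `graded`, the variables `score` and `grade`, and the cut-offs are assumptions chosen only to show the shape of the construct.

```sas
data graded;
  input score;
  length grade $1;
  select;                               /* begins the SELECT group   */
    when (score >= 90) grade = 'A';     /* first true WHEN wins      */
    when (score >= 80) grade = 'B';
    otherwise          grade = 'C';     /* no WHEN condition was met */
  end;                                  /* ends the SELECT group     */
  datalines;
95
83
40
;
run;
```

Because the WHEN conditions are tested in order and only the first true one runs, the branches behave as mutually exclusive even though the ranges overlap.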

· What statement do you code to tell SAS that it is to write to an external file? What statement do you code to write the record to the file?

The FILE statement names the external file, and the PUT statement writes the record to it.
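A minimal sketch of the FILE and PUT pairing follows. It assumes the sample data set SASHELP.CLASS is installed, and it uses a temporary fileref (`out`, an arbitrary name) so that no real path has to be invented.

```sas
/* TEMP asks SAS for a scratch file that is removed at the end of the session. */
filename out temp;

data _null_;               /* no output data set is needed */
  set sashelp.class;
  file out;                /* FILE: where the PUT statements write */
  put name $8. +1 age 2.;  /* PUT: write one record per observation */
run;
```

In real use the fileref would point at a physical file path instead of TEMP.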

· If you're not wanting any SAS output from a data step, how would you code the data statement to prevent SAS from producing a set?

Data _Null_

· What is the one statement to set the criteria of data that can be coded in any step?

OPTIONS statement: it is part of the SAS program and affects all steps that follow it.

· Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.

· How would you include common or reuse code to be processed along with your statements?

By using SAS macros, or by bringing in a shared program file with the %INCLUDE statement.

· When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: scan, index, or indexc?

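INDEX. It returns the position of the first occurrence of a substring within the character string. INDEXC looks for any one of a set of individual characters, and SCAN extracts the nth word rather than locating a value.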

· If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variable?

Using the KEEP= data set option or the KEEP statement.

· Code a PROC SORT on a data set containing State, District and County as the primary variables, along with several numeric variables.

Proc sort data=dataset-name ;
BY State District County ;
Run ;

· How would you delete duplicate observations?

NODUPLICATES

· How would you delete observations with duplicate keys?

NODUPKEY

· How would you code a merge that will keep only the observations that have matches from both sets?

Use the IN= data set option on each data set in the MERGE statement, then test the IN= variables with a subsetting IF statement.

· How would you code a merge that will write the matches of both to one data set, the non-matches from the left-most data.

Step 1: Define 3 data sets in the DATA statement.
Step 2: Use the IN= data set option to assign a flag variable to each of the 2 input data sets.
Step 3: Check the flags with an IF statement, then output the matches to the first data set and the non-matches to the other data sets.
Ex (keeping the matches only):
data xxx;
merge yyy (in = inyyy) zzz (in = inzzz);
by aaa;
if inyyy = 1 and inzzz = 1;
run;
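The steps can be sketched end to end as below. The input data sets reuse the placeholder names `yyy` and `zzz` and the key `aaa` from the example; the output names `matched` and `left_only`, the variables `v1` and `v2`, and the data values are invented for illustration.

```sas
data yyy;
  input aaa v1;
  datalines;
1 10
2 20
3 30
;
run;

data zzz;
  input aaa v2;
  datalines;
2 200
3 300
4 400
;
run;

/* Both inputs are already in order of aaa, so no PROC SORT is needed here. */
data matched left_only;
  merge yyy (in=inyyy) zzz (in=inzzz);
  by aaa;
  if inyyy and inzzz then output matched;   /* key found in both inputs    */
  else if inyyy then output left_only;      /* key only in left-most input */
run;
```

With this sample data, keys 2 and 3 should land in `matched`, key 1 in `left_only`, and key 4 (present only in `zzz`) is written nowhere.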

· What is the Program Data Vector (PDV)? What are its functions?

Function: To store the current obs;
PDV (Program Data Vector) is a logical area in memory where SAS creates a dataset one observation at a time. When SAS processes a data step it has two phases. Compilation phase and execution phase. During the compilation phase the input buffer is created to hold a record from external file. After input buffer is created the PDV is created. The PDV is the area of memory where SAS builds dataset, one observation at a time. The PDV contains two automatic variables _N_ and _ERROR_.

· Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.

SAS compiles the code.

· At compile time when a SAS data set is read, what items are created?

Automatic variables are created, along with the input buffer, the PDV and the descriptor information.

· Name statements that are recognized at compile time only?

Declarative statements such as DROP, KEEP, RENAME, LABEL, FORMAT, LENGTH, ARRAY and RETAIN.

· Name statements that are execution only.

INFILE, INPUT

· Identify statements whose placement in the DATA step is critical.

DATA, INPUT, RUN.

· Name statements that function at both compile and execution time.

INPUT

· In the flow of DATA step processing, what is the first action in a typical DATA Step?

The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.

· What is _n_?

It is a data counter variable in SAS.
Note: Both the _N_ and _ERROR_ variables are always available to you in the DATA step.

_N_ indicates the number of times SAS has looped through the DATA step.
This is not necessarily equal to the observation number, since a simple subsetting IF statement can change the relationship between the observation number and the number of iterations of the DATA step.

The _ERROR_ variable has a value of 1 if there is an error in the data for that observation and 0 if there is not. _N_ is an automatic variable created by SAS during DATA step processing. It gives the number of iterations SAS has made through the DATA step. It is available only in the DATA step and not in PROCs.

E.g., if we want to pick every third record in a data set, we can use _N_ as follows:
Data new-sas-data-set;
Set old;
if mod(_n_,3) = 1; /* subsetting IF: keeps observations 1, 4, 7, ... */
run;

Note: If we use a WHERE clause to subset the data, _N_ will not yield the required result, because _N_ then counts only the observations that pass the WHERE condition.

SAS interview questions: General

Under what circumstances would you code a SELECT construct instead of IF statements?

A: I think a SELECT group is used when you are checking one thing against several conditions, like:
select;
when (physics > 60) result = 'pass';
when (math > 100) result = 'pass';
when (english = 50) result = 'pass';
otherwise result = 'fail';
end;

What is the one statement to set the criteria of data that can be coded in any step?

A) Options statement.

What is the effect of the OPTIONS statement ERRORS=1?

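A) ERRORS=1 limits to one the number of observations for which SAS prints complete data-error messages in the log. This is separate from the _ERROR_ automatic variable, which has a value of 1 if there is an error in the data for that observation and 0 if there is not.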

What's the difference between VAR A1 - A4 and VAR A1 -- A4 ?

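A: VAR A1 - A4 (single dash) is a numbered range list: it refers to A1, A2, A3 and A4, wherever they sit in the data set. VAR A1 -- A4 (double dash) is a name range list: it refers to every variable positioned between A1 and A4 in the data set, inclusive, whatever those variables are called. If you submit VAR A1 --- A4 (three dashes) you will see an error message in the log.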
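A short sketch of the single-dash and double-dash variable lists: the data set `demo` and the variable `b` are made up, and `b` is deliberately created between `a1` and `a2` so that position and name numbering disagree.

```sas
data demo;
  a1 = 1; b = 99; a2 = 2; a3 = 3; a4 = 4;
run;

/* Numbered range list: a1 a2 a3 a4, regardless of position. */
proc print data=demo;
  var a1-a4;
run;

/* Name range list: everything from a1 through a4 by position,
   so b is included as well. */
proc print data=demo;
  var a1--a4;
run;
```

The first listing should show four columns and the second five.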

What do the SAS log messages "numeric values have been converted to character" mean? What are the implications?

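It means that SAS automatically converted a numeric value to character because the value was used where a character value is expected, for example as the argument of a character function. The conversion uses a default format (BEST12.), so the result can contain unexpected leading blanks; an explicit PUT function is safer.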

Why is a STOP statement needed for the POINT= option on a SET statement?

Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially.

How do you control the number of observations and/or variables read or written?

The FIRSTOBS= and OBS= options control the observations; the KEEP= and DROP= options control the variables.

Approximately what date is represented by the SAS date value of 730?

31st December 1961
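The answer can be checked directly, since a SAS date value is a count of days from 1 January 1960 (day 0). A minimal sketch, with `d` as an arbitrary variable name:

```sas
data _null_;
  d = 730;
  put d= date9.;   /* writes d=31DEC1961 to the log */
run;
```

1960 was a leap year (366 days), so day 366 is 1 January 1961 and day 730 falls 364 days later, on 31 December 1961.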

Identify statements whose placement in the DATA step is critical.

A: INPUT, DATA and RUN…

Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.

A) Compile

What does the RUN statement do?

A) The RUN statement marks the end of a DATA or PROC step and tells SAS to execute it. A DATA or PROC statement that follows also ends the previous step, so when several steps are submitted together the RUN statement can be omitted on all but the last one.

Why is SAS considered self-documenting?

A) SAS is considered self-documenting because, when it creates a data set, it stores descriptive information inside the data set, such as the creation date and time, the number of variables and their labels. You can look at that information using PROC CONTENTS.

What are some good SAS programming practices for processing very large data sets?

A) Sort them once, and use FIRSTOBS= and OBS= to test programs on a subset of the observations.

What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

A) Functions are used inside the DATA step and work across the variables of each observation, whereas PROCs work down the observations of a data set and can write the results to a new data set.

If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?

A) I would use PROC TRANSPOSE if there are few variables and arrays if there are many; it depends on the data.

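What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?

A) On unsorted data you can't use FIRST. or LAST. with an ordinary BY statement. Sort the data first or, if the observations are already grouped but not in sorted order, add the NOTSORTED option to the BY statement.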

How do you debug and test your SAS programs?

A) The first thing is to look in the log for errors, warnings and, in some cases, NOTEs, or to use the DATA step debugger.

What other SAS features do you use for error trapping and data validation?

A) Check the log, and for data validation use things like PROC FREQ, PROC MEANS or sometimes PROC PRINT to see what the data looks like.

How would you combine 3 or more tables with different structures?

A) I think sort them with common variables and use merge statement. I am not sure what you mean different structures.

What areas of SAS are you most interested in?

BASE, STAT, GRAPH, ETS

Briefly describe 5 ways to do a "table lookup" in SAS.

Match Merging, Direct Access, Format Tables, Arrays, PROC SQL

What versions of SAS have you used (on which platforms)?

SAS 8.2 in Windows and UNIX, SAS 7 and 6.12

What are some good SAS programming practices for processing very large data sets?

Use sampling with the OBS= option or subsetting, comment out lines that are not needed, and use DATA _NULL_.

What are some problems you might encounter in processing missing values?
In Data steps? Arithmetic? Comparisons? Functions? Classifying data?

Arithmetic operators propagate missing values: the result of any arithmetic operation on a missing value is a missing value. Most SAS statistical procedures exclude observations with any missing variable values from an analysis.
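A minimal sketch of how a missing value behaves in arithmetic compared with a function that skips missing arguments. The variable names are arbitrary.

```sas
data _null_;
  a = 1; b = .; c = 3;
  plus  = a + b + c;      /* the + operator propagates the missing value */
  total = sum(a, b, c);   /* SUM ignores missing arguments               */
  put plus= total=;       /* writes plus=. total=4 to the log            */
run;
```

The log should also carry a note that missing values were generated as a result of performing an operation on missing values.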

How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?

Using PROC TRANSPOSE

What is the difference between functions and PROCs that calculate the same simple descriptive statistics?

A PROC has a wider scope and its results can be sent to a different data set. Functions usually work on the variables of the data set being processed.

If you were told to create many records from one record,
show how you would do this using array and with PROC TRANSPOSE?

With arrays: declare an array for the variables in the record and then use a DO loop with an OUTPUT statement.
With PROC TRANSPOSE: list the variables in the VAR statement.
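Both approaches can be sketched on the same small input. The data set names (`wide`, `long_array`, `long_tr`), the variables and the values are invented for illustration.

```sas
data wide;
  input id q1 q2 q3;
  datalines;
1 10 20 30
2 40 50 60
;
run;

/* Array approach: one OUTPUT per array element. */
data long_array;
  set wide;
  array q{3} q1-q3;
  do item = 1 to dim(q);
    value = q{item};
    output;
  end;
  keep id item value;
run;

/* PROC TRANSPOSE approach: VAR names the columns to turn into rows. */
proc transpose data=wide
               out=long_tr (rename=(col1=value))
               name=item;
  by id;
  var q1-q3;
run;
```

Each input record should produce three output records either way. In `long_array` the variable `item` is numeric (1, 2, 3), while in `long_tr` it holds the source variable name (q1, q2, q3).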

What are _numeric_ and _character_ and what do they do?

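_NUMERIC_ and _CHARACTER_ are special variable lists that refer to all numeric and all character variables, respectively, that are defined in the data set, so a statement can read or write all of them at once.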

How would you create multiple observations from a single observation?

Using double Trailing @@

For what purpose would you use the RETAIN statement?

The RETAIN statement is used to hold the values of variables across iterations of the DATA step. Normally, variables created by INPUT or assignment statements are set to missing at the start of each iteration of the DATA step.
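A running total is the usual illustration of RETAIN. In this sketch the data set `running` and the variables `amount` and `total` are assumptions for the example.

```sas
data running;
  input amount;
  retain total 0;            /* keep total across iterations, starting at 0 */
  total = total + amount;    /* total becomes 5, then 15, then 35           */
  datalines;
5
10
20
;
run;
```

Without the RETAIN statement, `total` would be reset to missing at the top of every iteration and the addition would return missing each time.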

What is the order of evaluation of the arithmetic operators: + - * / ** ()?

(), **, *, /, +, -

How could you generate test data with no input data?

Using DATA _NULL_ and the PUT statement.

How do you debug and test your SAS programs?

Using Obs=0 and systems options to trace the program execution in log.

What can you learn from the SAS log when debugging?

It will display the execution of the whole program and its logic. It will also display each error with its line number so that you can find and edit the program.

What is the purpose of _error_?

It has only two values, which are 1 for error and 0 for no error.

How can you put a "trace" in your program?

By using ODS TRACE ON

How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?

In an assignment statement, a missing value is assigned as missing. In sort order, numeric missing values are the smallest values: ._ (underscore) sorts first, followed by . and then .A through .Z.

How do you test for missing values?

Using conditional statements such as IF-THEN/ELSE, WHERE and SELECT.

How are numeric and character missing values represented internally?

Character as a blank (' ') and numeric as a period (.).

Which date function advances a date, time or datetime value by a given interval?

INTNX.

In the flow of DATA step processing,
what is the first action in a typical DATA Step?

When you submit a DATA step, SAS processes the DATA step and then creates a new SAS data set. Processing has two phases: the compilation phase (creation of the input buffer and PDV) and the execution phase.

What are SAS/ACCESS and SAS/CONNECT?

SAS/ACCESS provides interfaces for reading and writing data in external databases such as Oracle, SQL Server and MS Access. SAS/CONNECT provides client/server connectivity between SAS sessions running on different machines.

What is the one statement to set the criteria of data that can be coded in any step?

OPTIONS statement, LABEL statement, KEEP / DROP statements.

What is the purpose of using the N=PS option?

The N=PS option creates a buffer in memory which is large enough to store PAGESIZE (PS) lines and enables a page to be formatted randomly prior to it being printed.

What are the scrubbing procedures in SAS?

Proc Sort with nodupkey option, because it will eliminate the duplicate values.

What are the new features included in the new version of SAS i.e., SAS9.1.3?

The main advantage of version 9 is faster execution of applications and centralized access of data and support. Many changes have been made in version 9 compared with version 8. The following are a few:
· SAS version 9 supports format and informat names longer than 8 characters, which was not possible in version 8.
· Numeric format names can be up to 32 characters in version 9, versus 8 in version 8.
· Character format names can be up to 31 characters in version 9.
· Numeric informat names can be up to 31 characters in version 9, versus 8 in version 8.
· Character informat names can be up to 30 characters in version 9.
· 3 new informats are available in version 9 to convert various date, time and datetime forms of data into a SAS date or SAS time:
ANYDTDTEw. - converts to a SAS date value.
ANYDTTMEw. - converts to a SAS time value.
ANYDTDTMw. - converts to a SAS datetime value.
· The CALL SYMPUTX routine is added in version 9. It creates a macro variable at execution time in the DATA step, trimming leading and trailing blanks and automatically converting a numeric value to character.
· A new ODS option (COLUMN OPTION) is included to create multiple columns in the output.

What differences did you find among versions 6, 8 and 9 of SAS?

The SAS 9 Architecture is fundamentally different from any prior version of SAS. In the SAS 9 architecture, SAS relies on a new component,
the Metadata Server, to provide an information layer between the programs and the data they access.
Metadata, such as security permissions for SAS libraries and where the various SAS servers are running, are maintained in a common repository.

What has been your most common programming mistake?

Missing semicolons, not checking the log after submitting a program, not using debugging techniques and not using the FSVIEW option vigorously.

Name several ways to achieve efficiency in your program. Explain trade-offs.

Efficiency and performance strategies can be classified into 5 different areas.
· CPU time
· Data storage
· Elapsed time
· Input/Output
· Memory

CPU time and elapsed time are the baseline measurements.

A few examples of efficiency violations: retaining unwanted data sets, and not subsetting early to eliminate unwanted records.

Efficiency improving techniques:
· Use KEEP and DROP statements to retain only the necessary variables.
· Use macros to reduce the amount of code.
· Use IF-THEN/ELSE statements to process data.
· Use the SQL procedure to reduce the number of programming steps.
· Use LENGTH statements to reduce variable size and so reduce data storage.
· Use DATA _NULL_ steps when no output data set is needed, to save data storage.

What other SAS products have you used and consider yourself proficient in using?

Data _NULL_ statement, Proc Means, Proc Report, Proc tabulate, Proc freq and Proc print, Proc Univariate etc.

What is the significance of the 'OF' in X=SUM (OF a1-a4, a6, a9);

If we don't use the OF keyword, the argument list might not be interpreted as we expect. Without OF, the function above would calculate a1 minus a4, plus a6 and a9, and not the whole sum of a1 through a4 plus a6 and a9. The same is true for the MEAN function.
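The effect of the OF keyword can be seen by running both forms on the same values. This is a sketch with made-up values; `with_of` and `without_of` are arbitrary names.

```sas
data _null_;
  a1 = 1; a2 = 2; a3 = 3; a4 = 4; a6 = 6; a9 = 9;

  /* OF makes a1-a4 a variable list: 1+2+3+4+6+9 */
  with_of    = sum(of a1-a4, a6, a9);

  /* Without OF, a1-a4 is a subtraction: (1-4)+6+9 */
  without_of = sum(a1-a4, a6, a9);

  put with_of= without_of=;
run;
```

Under these assumptions the two results should differ (25 versus 12), which is why a missing OF is easy to overlook: both statements run without an error.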

What do the PUT and INPUT functions do?

The INPUT function converts character data values to numeric values. The PUT function converts numeric values to character values.
Ex: for INPUT: INPUT(source, informat)
For PUT: PUT(source, format)
Note that the INPUT function requires an informat and the PUT function requires a format.
If we omit the INPUT or the PUT function during the data conversion, SAS will detect the mismatched variables and will try an automatic character-to-numeric or numeric-to-character conversion.
But sometimes this doesn't work, for example when a character value contains a $ sign that prevents the conversion. Therefore it is always advisable to include INPUT and PUT functions in your programs when conversions occur.
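A minimal sketch of explicit conversion in both directions, including a value with a dollar sign that needs a suitable informat. All variable names and values here are illustrative.

```sas
data _null_;
  char_num  = '1234';
  char_cash = '$1,234';

  num  = input(char_num, 8.);         /* character to numeric              */
  cash = input(char_cash, comma8.);   /* COMMAw. strips the $ and the comma */

  back      = put(num, 8.);           /* numeric to character, right-aligned */
  back_trim = strip(put(num, 8.));    /* same, with the blanks removed       */

  put num= cash= back= back_trim=;
run;
```

Choosing the informat explicitly keeps the conversion out of the hands of the automatic rules and avoids the conversion notes in the log.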

Which date function advances a date, time or datetime value by a given interval?

INTNX: the INTNX function advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value.
Ex: INTNX(interval,start-from,number-of-increments,alignment)

INTCK: INTCK(interval,start-of-period,end-of-period) is an interval function that counts the number of intervals between two given SAS dates, times or datetimes.
DATETIME(): returns the current date and time of day.
DATDIF(sdate,edate,basis): returns the number of days between two dates.
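The interval functions can be tried out as in this sketch. The dates are arbitrary, and the 'same' alignment value assumes SAS 9 or later.

```sas
data _null_;
  start = '15JAN2008'd;

  next_month = intnx('month', start, 1);           /* default: start of interval */
  same_day   = intnx('month', start, 1, 'same');   /* keep the day of the month  */

  /* INTCK counts interval boundaries crossed, not elapsed time. */
  months = intck('month', '31JAN2008'd, '01FEB2008'd);
  days   = datdif('31JAN2008'd, '01FEB2008'd, 'act/act');

  put next_month= date9. same_day= date9. months= days=;
run;
```

The log should show 01FEB2008 and 15FEB2008 for the two INTNX calls. `months` comes out as 1 even though only one day separates the two dates, because a month boundary lies between them.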

What do the MOD and INT function do? What do the PAD and DIM functions do?

MOD: MOD(argument, modulo) returns the remainder after the numeric argument is divided by the modulo, which can be a constant or a numeric variable.
INT: returns the integer portion of a numeric value, truncating the decimal portion.
PAD: strictly an option on the INFILE statement rather than a function. It pads each record with blanks so that all data lines have the same length, which is useful only when missing data occurs at the end of a record.
CATX: concatenates character strings, removes leading and trailing blanks, and inserts separators.
SCAN: returns a specified word from a character value. If the target variable has no length defined, SCAN assigns it a length of 200.
SUBSTR: extracts a substring or replaces character values.
Extraction of a substring: Middleinitial=substr(middlename,1,1);
Replacing character values: substr(phone,1,3)='433';
If the SUBSTR function is on the left side of an assignment statement, it replaces the contents of the character variable.
TRIM: trims the trailing blanks from character values.
SCAN vs. SUBSTR: SCAN extracts words within a value that are marked by delimiters. SUBSTR extracts a portion of the value from a stated position, so it is best used when we know the exact position of the substring to extract from a character value.
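
A short sketch of SCAN, SUBSTR and CATX side by side. The name and phone values are invented for illustration:

```sas
data _null_;
  length last first $20 initial $1 fullname $41 phone $8;
  name = 'Smith, John';
  last     = scan(name, 1, ', ');     /* first word, delimited by comma or blank */
  first    = scan(name, 2, ', ');     /* second word */
  initial  = substr(first, 1, 1);     /* one character starting at position 1 */
  fullname = catx(' ', first, last);  /* trims blanks, inserts one separator */
  phone = '555-1234';
  substr(phone, 1, 3) = '433';        /* left-hand SUBSTR replaces characters */
  put last= first= initial= fullname= phone=;
run;
```

Declaring lengths up front avoids the default length of 200 that SCAN would otherwise give its target variables.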

How might you use MOD and INT on numeric to mimic SUBSTR on character Strings?

The first argument to the MOD function is numeric and the second is a non-zero numeric; the result is the remainder when argument-1 is divided by argument-2. The INT function takes a single argument and returns its integer portion, truncating the decimal portion.
Note that the argument can be an expression. Dividing by a power of 10 and applying INT drops digits from the right, while MOD by a power of 10 keeps digits on the right, so the two together pick out a run of digits much as SUBSTR picks out characters.
DATA NEW ;
A = 123456 ;
X = INT( A/1000 ) ;
Y = MOD( A, 1000 ) ;
Z = MOD( INT( A/100 ), 100 ) ;
PUT A= X= Y= Z= ;
RUN ;
Log output: A=123456 X=123 Y=456 Z=34

In ARRAY processing, what does the DIM function do?

DIM: returns the number of elements in an array. When DIM supplies the stop value of an iterative DO statement, we do not have to re-specify the stop value if we change the dimension of the array.
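
A minimal sketch of DIM as the stop value of a DO loop. The array name and values are illustrative only:

```sas
data _null_;
  array scores{5} s1-s5 (10 20 30 40 50);
  total = 0;
  /* dim(scores) follows the array definition, so the loop needs no edit
     if the array is later resized */
  do i = 1 to dim(scores);
    total = total + scores{i};
  end;
  put total=;
run;
```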

How would you determine the number of missing or nonmissing values in computations?

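To determine the number of missing values that are excluded in a computation, use the NMISS function; the N function returns the number of nonmissing values.
data _null_;
m = . ;
y = 4 ;
z = 0 ;
N = N(m , y, z);
NMISS = NMISS (m , y, z);
put N= NMISS=; /* write the counts to the log */
run;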

The above program results in N = 2 (Number of non missing values) and NMISS = 1 (number of missing values).

Do you need to know if there are any missing values?

Note that the MISSING function accepts only one argument: is_missing=MISSING(field1); returns 1 if field1 is missing and 0 otherwise.
To test several numeric fields at once, use any_missing=(NMISS(field1,field2,field3) > 0); which is 1 if at least one field is missing and 0 if none are.
If you need to know how many missing values you have, use num_missing=NMISS(field1,field2,field3);
You can also find the number of non-missing values with non_missing=N(field1,field2,field3);

What is the difference between: x=a+b+c+d; and x=SUM (of a, b, c ,d);?

Is anyone wondering why you wouldn’t just use total=field1+field2+field3;
First, how do you want missing values handled? The SUM function returns the sum of non-missing values. If you choose addition, you will get a missing value for the result if any of the fields are missing. Which one is appropriate depends upon your needs.

However, there is an advantage to using the SUM function even if you want the result to be missing. If you have more than a couple of fields, you can often use shortcuts in writing the field names. If your fields are not numbered sequentially but are stored together in the program data vector, you can use: total=SUM(of fielda--zfield); Just make sure you remember the "of" and the double dashes, or your code will run but you won't get your intended results: without "of", SAS reads fielda--zfield as fielda minus negative zfield. MEAN is another function that calculates differently from the written-out formula when there are missing values.
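
The differences can be seen in a small sketch. The variables a through d are hypothetical and are created in that order, so a--d covers all four:

```sas
data _null_;
  a = 1; b = .; c = 3; d = 5;
  plus_total   = a + b + c + d;       /* missing, because b is missing */
  sum_total    = sum(of a--d);        /* 9: missing value is ignored */
  no_of        = sum(a--d);           /* 6: parsed as a - (-d) */
  mean_fn      = mean(of a--d);       /* 3: average of the 3 nonmissing values */
  mean_formula = (a + b + c + d) / 4; /* missing */
  put plus_total= sum_total= no_of= mean_fn= mean_formula=;
run;
```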

There is a field containing a date. It needs to be displayed in the format "ddmonyy" if it's before 1975, "dd mon ccyy" if it's after 1985, and as 'Disco Years' if it's between 1975 and 1985. How would you accomplish this in data step code? Using only PROC FORMAT.

data new ;
input date ddmmyy10.;
cards;
01/05/1955
01/09/1970
01/12/1975
19/10/1979
25/10/1982
10/10/1988
27/12/1991
;
run;
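proc format ;
/* nested formats go in square brackets; -< and <- keep the ranges from overlapping */
/* date7. prints ddmonyy; date9. (ddmonccyy) is the closest standard format to dd mon ccyy */
value dat low-<'01jan1975'd=[date7.]
'01jan1975'd-'01JAN1985'd="Disco Years"
'01JAN1985'd<-high=[date9.];
run;
proc print data=new;
format date dat. ;
run;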

In the following DATA step, what is needed for 'fraction' to print to the log?

data _null_;
x=1/3;
if x=.3333 then put 'fraction';
run;
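
The original post leaves this question unanswered. One common answer, offered here as a sketch: 1/3 is stored as a repeating fraction, so an exact comparison with .3333 is false. Rounding x to four decimal places before comparing is one way to make the condition true:

```sas
data _null_;
  x = 1/3;
  /* compare the rounded value rather than the stored repeating fraction */
  if round(x, .0001) = .3333 then put 'fraction';
run;
```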

What is the difference between calculating the 'mean' using the mean function and PROC MEANS?

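By default, PROC MEANS calculates summary statistics such as N, mean, standard deviation, minimum, and maximum for a variable across all observations in a data set, whereas the MEAN function computes only the mean of its arguments within a single observation.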

What are some differences between PROC SUMMARY and PROC MEANS?

PROC MEANS by default prints its results to the output window; you can suppress this with the NOPRINT option and send the results to a data set with the OUTPUT OUT= statement. PROC SUMMARY does not print anything by default: we have to code an OUTPUT statement explicitly, or specify the PRINT option to see the results.
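
A brief sketch of the defaults, using the SASHELP.CLASS sample data set that ships with SAS. The output data set and variable names are made up:

```sas
/* PROC MEANS prints by default */
proc means data=sashelp.class;
  var height;
run;

/* PROC SUMMARY prints only when PRINT is specified */
proc summary data=sashelp.class print;
  var height;
run;

/* NOPRINT plus OUTPUT OUT= sends the statistics to a data set instead */
proc means data=sashelp.class noprint;
  var height;
  output out=work.height_stats mean=mean_height;
run;
```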

What is a problem with merging two data sets that have variables with the same name but different data?

Understanding the basic algorithm of MERGE will help you understand how the step processes. In the common one-to-one case, when non-BY variables share a name, the value read from the data set listed later on the MERGE statement overwrites the value from the earlier one. There are still a few common scenarios whose results sometimes catch users off guard.
Here are a few of the most frequent 'gotchas':
1- BY variables have different lengths
It is possible to perform a MERGE when the lengths of the BY variables are different, but if the data set with the shorter version is listed first on the MERGE statement, the shorter length will be used for the length of the BY variable during the merge. Due to this shorter length, truncation occurs and unintended combinations could result.

In Version 8, a warning is issued to point out this data integrity risk. The warning will be issued regardless of which data set is listed first:

WARNING: Multiple lengths were specified for the BY variable name by input data sets. This may cause unexpected results.

Truncation can be avoided by naming the data set with the longest length for the BY variable first on the MERGE statement, but the warning message is still issued. To prevent the warning, use PROC CONTENTS to check that the BY variables have the same length before combining them in the MERGE step.

You can change the variable length with either a LENGTH statement in the merge DATA step prior to the MERGE statement, or by recreating the data sets to have identical lengths for the BY variables.
Note: When doing a MERGE, we should not have the MERGE and an IF-THEN statement in one data step if the IF-THEN statement involves two variables that come from two different merging data sets.
If it is not completely clear when MERGE and IF-THEN can be used in one data step and when they should not be, it is best to simply always separate them into different data steps. Following this recommendation helps ensure an error-free merge result.
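
The LENGTH approach can be sketched as below. The data sets, variables and values are hypothetical:

```sas
data one;
  length id $3;
  id = 'A12'; x = 1;
run;

data two;
  length id $5;
  id = 'A1234'; y = 2;
run;

data both;
  length id $5;   /* declared before MERGE so the longer length is used */
  merge one two;
  by id;
run;
```

Without the LENGTH statement, listing data set one first would give id a length of 3 and truncate 'A1234' to 'A12', producing a false match.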

Which data set is the controlling data set in the MERGE statement?

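Strictly speaking, no data set controls a match-merge: SAS reads every observation from all the data sets listed, whichever has fewer observations. For non-BY variables with the same name, the data set listed last supplies the value. To make one data set control which observations are kept, use its IN= variable in a subsetting IF.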

How do the IN= variables improve the capability of a MERGE?

The IN= variables
What if you want to keep in the output data set of a merge only the matches (only those observations to which both input data sets contribute)? SAS will set up for you special temporary variables, called the "IN=" variables, so that you can do this and more.
Here's what you have to do: signal to SAS on the MERGE statement that you need the IN= variables for the input data set(s), then use the IN= variables in the data step appropriately. So to keep only the matches in the match-merge above, ask for the IN= variables and use them:
data three;
merge one(in=x) two(in=y); /* x & y are your choices of names for the IN= variables */
by id;
if x=1 and y=1; /* keep only observations found in both data sets one and two */
run;

What techniques and/or PROCs do you use for tables?

Proc Freq, Proc univariate, Proc Tabulate & Proc Report.

Do you prefer PROC REPORT or PROC TABULATE? Why?

I prefer PROC REPORT unless I have to create cross-tabulation tables, because it gives me many options for modifying the look of my table (for example, the WIDTH= option changes the width of each column), whereas PROC TABULATE cannot produce some of the things I need. For example, TABULATE doesn't produce n (%) in the desired format.

How experienced are you with customized reporting and use of DATA _NULL_ features?

I have very good experience in creating customized reports with the DATA _NULL_ step. It is a DATA step that generates a report without creating a data set, which saves development time. Another advantage of DATA _NULL_ is that any compilation errors in its statements are detected and written to the log when we submit it, so they can be found by checking the log. It is also used to create macro variables from data set values.
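
A minimal sketch of both uses, assuming the SASHELP.CLASS sample data set. The column positions and the macro variable name nstudents are arbitrary choices:

```sas
data _null_;
  set sashelp.class end=last;
  file print;                               /* write to the output destination */
  if _n_ = 1 then put 'Name' @12 'Age';     /* header line */
  put name @12 age 3.;                      /* one report line per observation */
  if last then call symputx('nstudents', _n_);  /* create a macro variable */
run;

%put Number of students: &nstudents;
```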

What is the difference between nodup and nodupkey options?

NODUP compares all the variables in our data set (each observation is checked against the previous one written), while NODUPKEY compares just the BY variables.
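
The difference is easiest to see on a small made-up data set (NODUPRECS is the full name of the NODUP option):

```sas
data visits;
  input id visit;
  datalines;
1 1
1 1
1 2
2 1
;
run;

/* removes only the fully identical second row: 3 observations remain */
proc sort data=visits out=no_dup_recs noduprecs;
  by id visit;
run;

/* keeps the first row for each id: 2 observations remain */
proc sort data=visits out=no_dup_keys nodupkey;
  by id;
run;
```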

What is the difference between compiler and interpreter? Give any one example (software product) that act as an interpreter?

Both achieve similar purposes but differ in how they achieve them. An interpreter translates instructions one at a time and executes each one immediately. A compiler takes a whole program (source), for example one written in the SAS programming language, and translates it into object code or machine language before anything runs. Compiled code does the work much more efficiently, because the complete machine language program can then be executed directly.

Code the table’s statement for a single level frequency?

Proc freq data=lib.dataset;
table var;
*here you can mention a single variable or multiple variables separated by spaces to get a one-way frequency table for each;
run;

What is the main difference between rename and label?

1. Label is global and rename is local, i.e., the LABEL statement can be used in either a PROC or a DATA step, whereas the RENAME statement should be used only in a DATA step.
2.If we rename a variable, old name will be lost but if we label a variable its short name (old name) exists along with its descriptive name.

What is Enterprise Guide? What is the use of it?

Enterprise Guide is a point-and-click Windows client for SAS. Among other things, it offers a guided way to import text files into SAS. (It comes free with Base SAS version 9.0.)

What other SAS features do you use for error trapping and data validation? What are the validation tools in SAS?

For data sets: DATA data-set-name / DEBUG
For data sets: DATA data-set-name / STMTCHK
For macros: OPTIONS MPRINT MLOGIC SYMBOLGEN.

How can you put a "trace" in your program?

ODS TRACE ON and ODS TRACE OFF turn the trace record on and off.

How would you code a merge that will keep only the observations that have matches from both data sets?

Using "IN" variable option. Look at the following example.
data three;
merge one(in=x) two(in=y);
by id;
if x=1 and y=1;
run;
or
data three;
merge one(in=x) two(in=y);
by id;
if x and y;
run;

What are input dataset and output dataset options?

Input data set options include OBS=, FIRSTOBS=, WHERE= and IN=. Output data set options include COMPRESS= and REUSE=. Options that apply to both input and output data sets include KEEP=, DROP= and RENAME=.
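
As an illustrative sketch (the output data set names are invented; SASHELP.CLASS is the sample data set shipped with SAS), options on the SET statement act on the input and options on the DATA statement act on the output:

```sas
/* WHERE= and RENAME= applied on input; KEEP= and COMPRESS= applied on output */
data work.teens(keep=name gender age compress=yes);
  set sashelp.class(where=(age >= 13) rename=(sex=gender));
run;

/* FIRSTOBS= and OBS= select a range of input observations */
data work.first_five;
  set sashelp.class(firstobs=1 obs=5);
run;
```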

How can u create zero observation dataset?

Create the data set with the LIKE clause of PROC SQL.
Ex:
proc sql;
create table latha.emp like oracle.emp;
quit;

Here the LIKE clause copies the existing table's structure to the new table, so this method results in the creation of an empty table.
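
The same idea as a runnable sketch against the SASHELP.CLASS sample data set, together with a DATA step alternative that uses OBS=0. The output table names are arbitrary:

```sas
/* PROC SQL: copy the structure only */
proc sql;
  create table work.class_empty like sashelp.class;
quit;

/* DATA step: read zero observations, keeping the variable definitions */
data work.class_empty2;
  set sashelp.class(obs=0);
run;
```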

Have you ever-linked SAS code, If so, describe the link and any required statements used to either process the code or the step itself?

In the editor window we write
%include 'path of the sas file';
run;
In a non-windowing environment there is no need to give the RUN statement.

How can u import .CSV file in to SAS? tell Syntax?

A CSV file can be created in a text editor such as Notepad, with the variable names on the first line. To import it:
proc import datafile='E:\age.csv' out=sarath dbms=csv replace;
getnames=yes;
run;
proc print data=sarath;
run;

What is the use of Proc SQl?

PROC SQL is a powerful tool in SAS that combines the functionality of DATA and PROC steps. PROC SQL can sort, summarize, subset, join (merge), and concatenate data sets, create new variables, and print the results or create a new data set, all in one step! PROC SQL can use fewer resources than the equivalent DATA and PROC steps. Joining tables in PROC SQL does not require sorting the data beforehand, which is a must for a DATA step merge.

What is SAS GRAPH?

SAS/GRAPH software creates and delivers accurate, high-impact visuals that enable decision makers to gain a quick understanding of critical business issues.

Why is a STOP statement needed for the point=option on a SET statement?

When you use the POINT= option, you must include a STOP statement to stop DATA step processing, programming logic that checks for an invalid value of the POINT= variable, or both. Because POINT= reads only those observations that are specified in the DO statement, SAS cannot read an end-of-file indicator as it would if the file were being read sequentially. Because reading an end-of-file indicator ends a DATA step automatically, failure to substitute another means of ending the DATA step when you use POINT= can cause the DATA step to go into a continuous loop.
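
A small sketch of direct access with POINT=, reading every fifth observation of the SASHELP.CLASS sample data set. The step size and data set names are arbitrary:

```sas
data work.every_fifth;
  do i = 1 to total by 5;
    set sashelp.class point=i nobs=total;  /* read observation number i directly */
    output;
  end;
  stop;  /* no end-of-file is ever read, so end the step explicitly */
run;
```

The POINT= variable i is temporary and is not written to the output data set.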

What is the difference between nodup and nodupkey options?

The NODUP option checks for and eliminates duplicate observations. The NODUPKEY option checks for and eliminates observations with duplicate BY variable values.

SAS interview Q & A: PROC SQl and SAS GRAPH and ODS

PROC SQL:

1. What are the three types of join?

A. The three types of join discussed here are the inner join, left join and right join (PROC SQL also supports a full join). The inner join takes only the rows with matching values in both tables, as specified in the ON clause. The left join keeps all the rows from the first table and joins the matching rows of the second table to them. The right join keeps all the rows of table b and joins the matching rows of table a to them.

2. Have you ever used PROC SQL for data summarization?

A. Yes, I have used it for summarization at times. For example, if I have to calculate the maximum value of BP for patients 101, 102 and 103, I use the MAX(bpd) function to get the maximum value and a GROUP BY clause to group by patient.

3. Tell me about your SQL experience?

A. I have used the SAS/ACCESS SQL pass-through facility to connect to external databases and import tables from them, as well as from Microsoft Access and Excel files. Besides this, I have often used PROC SQL for joining tables.

4. Once you have had the data read into SAS datasets are you more of a data step programmer or a PROC SQL programmer?

A. It depends on what types of analysis data sets are required for creating tables, but I am more of a data step programmer as it gives me more flexibility. For example, when creating a change-from-baseline data set for blood pressure, I sometimes have to retain certain values, use arrays, or use the FIRST. and LAST. variables.

5. What types of programming tasks do you use PROC SQL for versus the data step?

A. PROC SQL is very convenient for performing table joins compared to a data step merge, as it does not require the key columns to be sorted prior to the join. A data step is more suitable for sequential observation-by-observation processing. PROC SQL can save a great deal of time if you want to filter the variables while selecting, modify them, apply formats, or create new variables and macro variables, as well as subset the data. PROC SQL offers great flexibility for joining tables.

6. Have you ever used PROC SQL to read in a raw data file?

A. No. I don't think it can be used.

7. How do you merge data in Proc SQL?

With a join. The inner join takes only the rows that match on the ON condition, the left join keeps all the rows of the first table, and the right join keeps all the rows of the second table. For example:
PROC SQL;
CREATE TABLE BOTH AS
SELECT A.PATIENT,
A.DATE FORMAT=DATE7. AS DATE,
A.PULSE,B.MED, B.DOSES,
B.AMT FORMAT=4.1
FROM VITALS A INNER JOIN DOSING B
ON (A.PATIENT = B.PATIENT)
AND(A.DATE = B.DATE)
ORDER BY PATIENT, DATE;
QUIT;

8. What are the statements in Proc SQl?

The clauses of the SELECT statement, in order: Select, From, Where, Group By, Having, Order By.
PROC SQL;
CREATE TABLE HIGHBPP2 AS
SELECT PATIENT,
COUNT (PATIENT) AS N,
DATE FORMAT=DATE7.,
MAX(BPD) AS BPDHIGH
FROM VITALS
WHERE PATIENT IN (101 102 103)
GROUP BY PATIENT
HAVING BPD = CALCULATED BPDHIGH
ORDER BY CALCULATED BPDHIGH;
Quit;

9. Why and when do you use Proc SQl?

PROC SQL is very convenient for performing table joins compared to a data step merge, as it does not require the key columns to be sorted prior to the join. A data step is more suitable for sequential observation-by-observation processing.
PROC SQL can save a great deal of time if we want to filter the variables while selecting, modify them, apply formats, or create new variables and macro variables, as well as subset the data. PROC SQL offers great flexibility for joining tables.

SAS GRAPH:

1. What type of graphs have you have generated using SAS?

A. I have used Proc GPLOT where I have created change from baseline scatter plots. I have also used Proc LIFETEST to create Kaplan-Meier survival estimates plots for survival analysis to determine which treatment displays better time-to-event distribution.

2. Have you ever used GREPLAY?

A. Yes, I have used the PROC GREPLAY point-and-click interface to integrate 4 graphs, which were produced by the REG procedure, on one page.

3. What is the symbol statement used for?

A. The SYMBOL statement defines how plot symbols appear in the graphics output. Its options can specify the color, font and height of the symbols displayed.

4. Have you ever used the annotate facility? What is a situation when you have had to use the ANNOTATE facility in the past?

A. Yes, I have used the annotate facility for graphs. I have used the annotate facility to position labels in the Kaplan-Meier survival estimates, where I had to specify the function as ‘label’ and give the x and y co-ordinates and the position where this label is to be placed.

ODS (OUTPUT DELIVERY SYSTEM):

1. What are all the ODS procedure have u encountered?

Tracing and selecting the procedure output:
ODS Trace on;
Proc steps…;
Run;
ODS Trace off;
ODS Select statement:
Proc steps…;
ODS Select output-object-list;
Run;
ODS Output statement:
ODS output output-object = new SAS dataset;
ODS HTML:
ODS html body = "path\marinebody.html" contents = "path\marineTOC.html" page = "path\marinepage.html" frame = "path\marineframe.html";
…..
ODS html close;
ODS RTF:
ODS rtf file = "filename.rtf" options;
Options include columns=n, bodytitle, SASdate and style.
ODS rtf close;
Similarly for PDF:
ODS pdf file = "filename.pdf" options;
……..
ODS pdf close;
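
The statements above can be combined into a runnable sketch. It assumes the SASHELP.CLASS sample data set and PROC UNIVARIATE, whose Moments output object is used here; the file and data set names are made up:

```sas
/* 1. Find the output object names: they are written to the log */
ods trace on;
proc univariate data=sashelp.class;
  var height;
run;
ods trace off;

/* 2. Select one object, save it as a data set, and route it to RTF */
ods rtf file='height_report.rtf' bodytitle;
ods select Moments;
ods output Moments=work.height_moments;
proc univariate data=sashelp.class;
  var height;
run;
ods rtf close;
```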

2. What is your experience with ODS?

A. I have used ODS to create output files in RTF, HTML and PDF formats, as per the requirements of my manager. HTML files can be posted on a web site for viewing or imported into word processors.
ODS HTML body = 'path' contents = 'path' frame = 'path';
ODS HTML close;
ODS RTF FILE = 'path';
ODS RTF close;
When we create RTF output we can copy it into a Word document and edit and resize it like Word tables.

3. What does the trace option do?

A. ODS Trace is used to find the names of the particular output objects when several of them are created by some procedure.
ODS TRACE ON;
ODS TRACE OFF;

SAS UNIX:

1. Unix environment?

SAS can be used effectively with the Unix operating system. Some options let the programmer read files from the terminal as well as save the output to the terminal.

2. When would you use UNIX instead of PC SAS?

When we need to submit the program in batch/non-interactive mode. When we are concerned with security issues.

3. What operating systems can you sit down at today and be productive?

I can be productive on Unix and Windows system.

4. Are you comfortable at the command line?

A. Yes, I am comfortable in command-line mode. I can write commands such as ls for listing files, ls -a for listing all files, ls -d for listing a directory itself, and ls -l for listing file privileges.

5. Setting permissions?

A. r Read permission
w Write permission
x Execute permission
- no permission
Change permissions on file
chmod [options] file
chmod u+w file [gives the user (owner) write permission]
chmod g+r file [gives the group read permission]
chmod o-x file [removes execute permission for others]

6. Can you write shell scripts?

A. Yes, I can write shell scripts. I have used the VI editor (multiple commands) (n editor, single line) to write shell scripts (a shell script is a group of commands to be executed at once). The command vi opens the editor. Escape followed by :wq saves the file and quits the editor. To execute a shell script, we run filename.sh. I have written a shell script to match a user id to a person's name.

7. Do you know how to use the VI editor?

A. Yes. It uses standard alphabetic keys for commands. We can create a new file with the command vi 'filename'. In command mode, the letters of the keyboard perform editing functions (like moving the cursor, deleting text, etc.). To enter command mode, press the escape key.

SAS Model resumes and SAS Tips and Tricks


Leave your IDs for SAS clinical trials resumes.
SAS tips: http://www.sastips.com/ http://www.asu.edu/it/fyi/dst/helpdocs/statistics/sas/tips/index.html
SAS Hints
* Delete temporary files
Use proc datasets to delete temporary files created by SAS.
data name;
set name;
run;
SAS code ...
proc datasets nolist;
delete name;
run;
quit;

* Determine if a file exists
Use %sysexec to spawn a shell
%sysexec /bin/ksh -c "if [ ! -a /etc/passwd ]; then exit 99; else exit 0; fi;";
%put &sysrc;
or
Use filename pipe
%let thefile=/etc/passwd;
filename testit pipe "if [ ! -f &thefile ]; then echo 'no'; else echo 'yes'; fi";
data _null_;
infile testit pad missover lrecl=3;
input answer $3.;
put answer=;
run;
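
As an alternative sketch that avoids spawning a shell, the FILEEXIST function can be called through %SYSFUNC. The macro name check_file is hypothetical and not part of the original tip:

```sas
%let thefile=/etc/passwd;

%macro check_file(path);
  /* FILEEXIST returns 1 if the external file exists and 0 otherwise */
  %if %sysfunc(fileexist(&path)) %then %put NOTE: &path exists.;
  %else %put NOTE: &path does not exist.;
%mend check_file;

%check_file(&thefile)
```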

* Direct output to different directory
Direct SAS output to a directory (other than the one where SAS is being run)
sas -work /different_directory program-name
* Errors (specify # of errors for SAS to put into log)
Specify the number of errors for SAS to list in the log file in the options line (useful to obtain list of errors, i.e., missing values for markers)
options ls=65 ps=55 pageno=1 errors=40;
* Execute Unix command from SAS
Single Unix commands can be executed by
X command;
or
call system ('command');
(call system can only be used inside a data step)
Multiple Unix commands can be executed by
X 'command 1; command 2; … command n';
or
call system ('command1; command2; … command n');
%sysexec macro can also be used (macro test will execute Unix commands 'pwd' and 'ls -l'):
%macro test;
%sysexec %str(pwd; ls -l);
%mend test;
%test;
X; (with no Unix commands) starts a shell. The user can run programs, check output, etc., then type 'exit' to return to SAS.

* Generate delimited file with no spaces between columns
data _NULL_;
set sasdataset;
file "output.txt"; /* or "output.csv" */
newvar=compress(var1||DELIM||var2);
put newvar;
run;
DELIM is comma ',' or tab '09'x
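
Another way to get the same kind of file, shown as a sketch rather than the original author's method, is to let the FILE statement insert the delimiters with the DSD and DLM= options. SASHELP.CLASS and the file name class.csv are only examples:

```sas
data _null_;
  set sashelp.class;
  file "class.csv" dsd dlm=',';        /* DSD writes delimiter-separated values */
  put name sex age height weight;      /* list output: one delimiter between values */
run;
```

For a tab-delimited file, dlm='09'x can be used in place of dlm=','.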

* Input – Group variables and informats
Informat lists can be grouped when input values are arranged in a pattern. A group informat list consists of 2 lists – the names of the variables to read enclosed in parentheses and the corresponding informats separated by either blanks or commas and enclosed in parentheses.
If values for the 5 variables SCORE1 through SCORE5 are stored as four columns per value without intervening blanks the input statement is
input (score1-score5) (4.);
In the next example, X is read with the 2. informat and the +1 column pointer control then moves the pointer forward one column. The pointer is then at Y, which is read with the 2. informat, followed by another +1.
The pointer then moves to Z, which is read with the 2. informat.
data test;
input (x y z) (2.,+1);
datalines;
 2 24 36
 0 20 30
;
The n* modifier can be used to specify the number of times to repeat the next informat.
input (name score1-score5) ($10. 5*4.);
Here the 4. informat is applied 5 times, once for each of score1-score5.

* Noterminal option for PROC EXPORT or PROC IMPORT
The message "Error: Cannot open X display" may be received when using PROC IMPORT or PROC EXPORT in batch mode on Unix systems without a terminal present. This can be avoided by using the -noterminal option.
sas program-name -noterminal

* Provide timing and memory usage information
fullstimer collects performance statistics on each SAS job step and for the job as a whole and puts them in the log file. Note: measures are a snapshot view of performance at step and job level; each SAS port yields different fullstimer statistics based on the host operating system.
sas -fullstimer program-name
Sample result of SAS data step
NOTE: DATA statement used:
real time 0.06 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
Memory 88k
Page Faults 10
Page Reclaims 0
Page Swaps 0
Voluntary Context Switches 22
Involuntary Context Switches 0
Block Input Operations 10
Block Output Operations 12

* Read in compressed data file using zcat
The environment variable '$HOME' is needed (zcat/gunzip uses it in the shell).
filename thedata pipe 'zcat $HOME/datafile.Z';
OR
filename thedata pipe 'gunzip -c $HOME/datafile.gz'; /* for gzipped file */
data _null_;
infile thedata pad lrecl=80; /* record length matches the $80. informat */
input theline $80.;
put theline=;
run;

* Read in space/tab delimited files
data name;
infile use dsd delimiter=" " firstobs=2 truncover;
input heading1 :$10. heading2 $ heading3 ……;
run;
dlm can be used in place of delimiter
Tab delimited files: use dlm='09'x
Comma delimited files: use dlm=","
firstobs=2 tells SAS to start reading at line 2 (skips header line)
:$10. - the colon tells SAS to read up to the number of characters specified or to the next delimiter, whichever comes first

* Return the ID of a user in Unix environment
%let person=%sysget(USER);
%put User is &person;
SAS log: User is uniqname
* Return the value of a specified operating environment variable
%let homedir=%sysget(HOME); (HOME must be in capital letters)
data _null_;
put "The value of my HOME environment variable is: &homedir";
run;
SAS log: The value of my HOME environment variable is: /afs/sph.umich.edu/user/uniqname

* Rounding in SAS
CEIL (ceiling) function (rounds up)
Returns the smallest integer greater than or equal to the argument
data a;
x=2.102;
y=ceil(x);
put x= y=;
run;
Results in: x=2.102 y=3
Floor function (rounds down)
Returns the largest integer less than or equal to the argument
data a;
x=2.102;
y=floor(x);
put x= y=;
run;
Results in: x=2.102 y=2
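
For comparison, a short sketch with a negative value, where rounding up and rounding down are easy to confuse. ROUND and INT are included as related functions that the tip above does not cover:

```sas
data _null_;
  x = -2.102;
  up      = ceil(x);        /* -2: smallest integer >= x */
  down    = floor(x);       /* -3: largest integer <= x */
  nearest = round(x);       /* -2: nearest integer */
  tenth   = round(x, 0.1);  /* -2.1: nearest multiple of 0.1 */
  trunc   = int(x);         /* -2: decimal portion dropped */
  put x= up= down= nearest= tenth= trunc=;
run;
```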

* Specify a character string to pass to SAS programs using sysparm
sas program-name -sysparm dept.projects (sysparm supplies name of data set for proc report)
proc report data=&sysparm report=test.resorces.priority.rept;
title "%sysfunc(date(),worddate.)";
title2 'Active Projects By Priority';
run;
SAS sees:
proc report data=dept.projects report=test.resorces.priority.rept;
title "Today's Date"; (the current date, written in worddate. format)
title2 'Active Projects By Priority';
run;
or
sas height.sas -sysparm cm
data height;
set height;
if sysparm() = 'cm' then do;
height=height*2.54;
unit='centimeters';
end;
run;

* SAS output (the input data set height contains name, height, and unit, with heights originally in inches)
NAME HEIGHT UNIT
John 175.26 centimeters
Sally 162.56 centimeters
Peter 190.50 centimeters
* Use SAS with pipes or as a filter under Unix (writing stdout to stdin without using an intermediate file)
* Using X commands in Solaris vs. Linux
Solaris – run X program, redirect STDOUT to file, and read file into SAS dataset:
%let xcomm="getrand -n &numobs > rand.dat";
X &xcomm;
data new;
infile "rand.dat";
input ranu;
run;
Linux – this will only work with the pipe option
%let xcomm="getrand -n &numobs";
filename fromunix pipe &xcomm;
data new;
infile fromunix;
input rannum;
run;

Friday, January 25, 2008

SAS® and the CDISC (Clinical Data Interchange Standards Consortium)

CDISC

The Clinical Data Interchange Standards Consortium (CDISC) is primarily concerned with developing standards that aid the exchange of information between companies in the biopharma ecosystem.

These include the following models:
• Operational Data Model (ODM): operational support of data collection
• Study Data Tabulation Model (SDTM): data tabulation data sets
• Case Report Tabulation Data Definition Specification (CRT-DDS, also known as define.xml)
• Laboratory Data Model (LAB)
• Standard for Exchange of Non-clinical Data (SEND)
• BRIDG: protocol representation
• Analysis Data Model (ADaM): analysis data structures
• And others…

Taken together, these standards and guidelines reflect the challenges of supporting the clinical research process.

The importance of data standards

Data standards are a critical component in the quest to improve global public health. Inefficiencies in the collection, processing and analysis of patient and health-related information drive up the cost of research and development for life sciences companies and negatively affect the cost and quality of healthcare delivery for patients and consumers.

SAS software support for CDISC standards

In addition to helping define CDISC standards, SAS is making certain that our products and solutions support the implementation of CDISC data standards. SAS®9 includes a component called PROC CDISC that enables organizations running SAS programs to work with CDISC structured data. PROC CDISC supports bi-directional conversion of data content contained in a CDISC ODM XML document to and from SAS-accessible data sources. The current version of PROC CDISC also supports content validation of SAS-accessible data sources against the CDISC SDTM data domain definitions. See http://www.cdisc.org/ for details on individual format descriptions.

CDISC standards such as SDTM, ODM, LAB and ADaM can be effectively implemented in solutions like SAS Drug Development and SAS DI Studio, and we're currently exploring additional ways that these standard processes and data structures can be utilized within our software.

The SAS XML Libname Engine has been enhanced in SAS 9.1.3 to natively read and write CDISC ODM file content. Using the SAS XML Libname Engine, any data content accessible to SAS may be converted to a CDISC ODM XML document, or conversely, any content in a CDISC ODM XML document may be converted to a SAS dataset or other SAS-accessible data source.
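Reading an ODM document through the engine might look like the sketch below. It is an illustration under assumptions rather than a tested recipe: the file path, the fileref/libref name odm, and the member name AE are hypothetical, and the XMLTYPE=CDISCODM and FORMATACTIVE= options should be checked against the XML LIBNAME engine documentation for your SAS release.

```sas
/* Hypothetical path to an ODM XML document */
filename odm '/path/to/ae.xml';

/* Libref with the same name as the fileref, so the engine uses that file. */
/* XMLTYPE=CDISCODM asks the engine to interpret the file as CDISC ODM.    */
libname odm xml xmltype=CDISCODM FormatActive=YES;

/* Inspect the structure of one member (AE is an assumed ItemGroup name) */
proc contents data=odm.AE varnum;
run;

/* Copy the content into an ordinary SAS data set */
data work.ae;
  set odm.AE;
run;

libname odm clear;
```

Writing in the other direction would follow the same pattern, with the XML libref on the output side of a DATA step or PROC COPY.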

SAS CDISC implementation services

In addition to providing CDISC support within our software, SAS consultants are ready to help your organization implement CDISC standards to drive efficiencies in your clinical development processes.

Glossary

ADaM: Analysis Data Model
CDISC: Clinical Data Interchange Standards Consortium
CRT-DDS: Case Report Tabulation Data Definition Specification
LAB: Laboratory Data Model
ODM: Operational Data Model
SDS: Submission Data Standards
SDTM: Study Data Tabulation Model
XML: eXtensible Markup Language