Sunday, April 22, 2012

SAS & Excel Via Local Provider

A question came up on SAS-L about how to get Excel to read SAS datasets using the Data Sources within Excel. Here are some screenshots showing how it works:

Wednesday, April 18, 2012

Hash a SAS Value

Sometimes it is useful to hash a value so that a unique key can be built into the data. For example, say you are looking at a system performance log. You have a PID, a process name, and a user. PIDs are reused by the system all the time, so narrowing down uniqueness over the course of a day is hard.

In order to get a unique value, you could concatenate the values into one:

000789654 || WeeklyProcess || gertre5

We are assuming that there is no need to ever reverse the values. This is a key assumption.

There is an undocumented function in SAS called CRCXX1 that can create a unique hash. Here is some code illustrating it:

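data A;
/* list input: each field is read up to the next blank */
input name :$200. gender :$8. state :$20.;
/* concatenate the three fields and strip the padding blanks */
x = compress(name||gender||state);
/* CRCXX1 is undocumented; it returns a numeric hash of the string */
y = CRCXX1(x);
/* write the key string and its hash to the log */
put x= y=32. ;
datalines;
Churchill,Alan Male Colorado
Churchill,John Male Colorado
;
run;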

The results:

884  data A;
885  input name :$200. gender :$8. state :$20.;
886  x = compress(name||gender||state);
887  y = CRCXX1(x);
888  put x= y=32. ;
889  datalines;

x=Churchill,AlanMaleColorado y=1558070123
x=Churchill,JohnMaleColorado y=837584169
NOTE: The data set WORK.A has 2 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


892  ;
893  run;

This could be very valuable when you need to tighten up processing and have some throwaway field values. The person who mentioned the undocumented function says it is good for about 1 million unique values before collisions start to appear. Above that, go with the MD5 function.
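For the MD5 route, here is a minimal sketch under a few assumptions. The data set name B and the variable name key are made up for illustration, and the '|' delimiter passed to CATX is my own choice rather than part of the original tip. MD5 returns a 16-byte binary string, so the $HEX32. format is used to store it as readable hex.

```sas
data B;
  /* key holds the 32-character hex form of the 16-byte MD5 digest */
  length name $200 gender $8 state $20 key $32;
  input name gender state;
  /* catx() trims each value and joins them with '|' so the field boundaries stay unambiguous */
  key = put(md5(catx('|', name, gender, state)), $hex32.);
  put name= key=;
datalines;
Churchill,Alan Male Colorado
Churchill,John Male Colorado
;
run;
```

The delimiter is there so that two different sets of field values cannot collapse into the same concatenated string before hashing. As with CRCXX1, this assumes the key never needs to be reversed.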

SAS throwing RPC error

If you are writing code in C# and get this error when creating a LanguageService: The RPC server is unavailable. (Exception from HRESULT:...