select highest correlated pairs from a matrix SAS
I have a dataset like this
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
I build a correlation matrix like this:
proc corr
data = have
out = correlation_matrix
(where = (_TYPE_ = "CORR"))
noprint;
run;
I've attempted to think out loud some code which would achieve something similar to what I'm looking for, the syntax nor the logic is correct but I'm having a go at describing what I'm looking for
proc sort
data = correlation_matrix
by _NAME_;
run;
data _temp;
set correlation_matrix;
array col[*] _numeric_;
by _NAME_;
do i = 1 to dim(col);
if col(i) > 0.6 then do;
%let list = append(vname(col));
end;
run;
And from the correlation matrix, I'm looking for a way to return pairs which have a correlation of 60% or is above some threshold, the pairs I will then use to build scatter plot / histogram matrices like this
proc contents;
data = high_correlation_pairs
out = contents
noprint;
run;
proc sort
data = contents
nodupkey;
by name;
run;
proc sql noprint;
select name INTO: highly_correlated_pairs
separated by " "
from contents
;
quit;
ODS GRAPHICS /
IMAGEMAP=OFF;
OPTIONS VALIDVARNAME=ANY;
PROC SGSCATTER
DATA=have;
TITLE "Scatter Plot Matrix";
FOOTNOTE;
MATRIX &highly_correlated_pairs
/
DIAGONAL=(HISTOGRAM )
START=TOPLEFT
NOLEGEND
;
RUN;
TITLE; FOOTNOTE;
I'm just unsure how to select variables from the matrix which have a pair of over 60% correlation, it could even by a by NAME return the columns with corr over 60%
arrays select macros sas select-into
add a comment |
I have a dataset like this
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
I build a correlation matrix like this:
proc corr
data = have
out = correlation_matrix
(where = (_TYPE_ = "CORR"))
noprint;
run;
I've attempted to think out loud some code which would achieve something similar to what I'm looking for, the syntax nor the logic is correct but I'm having a go at describing what I'm looking for
proc sort
data = correlation_matrix
by _NAME_;
run;
data _temp;
set correlation_matrix;
array col[*] _numeric_;
by _NAME_;
do i = 1 to dim(col);
if col(i) > 0.6 then do;
%let list = append(vname(col));
end;
run;
And from the correlation matrix, I'm looking for a way to return pairs which have a correlation of 60% or is above some threshold, the pairs I will then use to build scatter plot / histogram matrices like this
proc contents;
data = high_correlation_pairs
out = contents
noprint;
run;
proc sort
data = contents
nodupkey;
by name;
run;
proc sql noprint;
select name INTO: highly_correlated_pairs
separated by " "
from contents
;
quit;
ODS GRAPHICS /
IMAGEMAP=OFF;
OPTIONS VALIDVARNAME=ANY;
PROC SGSCATTER
DATA=have;
TITLE "Scatter Plot Matrix";
FOOTNOTE;
MATRIX &highly_correlated_pairs
/
DIAGONAL=(HISTOGRAM )
START=TOPLEFT
NOLEGEND
;
RUN;
TITLE; FOOTNOTE;
I'm just unsure how to select variables from the matrix which have a pair of over 60% correlation, it could even by a by NAME return the columns with corr over 60%
arrays select macros sas select-into
add a comment |
I have a dataset like this
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
I build a correlation matrix like this:
proc corr
data = have
out = correlation_matrix
(where = (_TYPE_ = "CORR"))
noprint;
run;
I've attempted to think out loud some code which would achieve something similar to what I'm looking for, the syntax nor the logic is correct but I'm having a go at describing what I'm looking for
proc sort
data = correlation_matrix
by _NAME_;
run;
data _temp;
set correlation_matrix;
array col[*] _numeric_;
by _NAME_;
do i = 1 to dim(col);
if col(i) > 0.6 then do;
%let list = append(vname(col));
end;
run;
And from the correlation matrix, I'm looking for a way to return pairs which have a correlation of 60% or is above some threshold, the pairs I will then use to build scatter plot / histogram matrices like this
proc contents;
data = high_correlation_pairs
out = contents
noprint;
run;
proc sort
data = contents
nodupkey;
by name;
run;
proc sql noprint;
select name INTO: highly_correlated_pairs
separated by " "
from contents
;
quit;
ODS GRAPHICS /
IMAGEMAP=OFF;
OPTIONS VALIDVARNAME=ANY;
PROC SGSCATTER
DATA=have;
TITLE "Scatter Plot Matrix";
FOOTNOTE;
MATRIX &highly_correlated_pairs
/
DIAGONAL=(HISTOGRAM )
START=TOPLEFT
NOLEGEND
;
RUN;
TITLE; FOOTNOTE;
I'm just unsure how to select variables from the matrix which have a pair of over 60% correlation, it could even by a by NAME return the columns with corr over 60%
arrays select macros sas select-into
I have a dataset like this
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
I build a correlation matrix like this:
proc corr
data = have
out = correlation_matrix
(where = (_TYPE_ = "CORR"))
noprint;
run;
I've attempted to think out loud some code which would achieve something similar to what I'm looking for, the syntax nor the logic is correct but I'm having a go at describing what I'm looking for
proc sort
data = correlation_matrix
by _NAME_;
run;
data _temp;
set correlation_matrix;
array col[*] _numeric_;
by _NAME_;
do i = 1 to dim(col);
if col(i) > 0.6 then do;
%let list = append(vname(col));
end;
run;
And from the correlation matrix, I'm looking for a way to return pairs which have a correlation of 60% or is above some threshold, the pairs I will then use to build scatter plot / histogram matrices like this
proc contents;
data = high_correlation_pairs
out = contents
noprint;
run;
proc sort
data = contents
nodupkey;
by name;
run;
proc sql noprint;
select name INTO: highly_correlated_pairs
separated by " "
from contents
;
quit;
ODS GRAPHICS /
IMAGEMAP=OFF;
OPTIONS VALIDVARNAME=ANY;
PROC SGSCATTER
DATA=have;
TITLE "Scatter Plot Matrix";
FOOTNOTE;
MATRIX &highly_correlated_pairs
/
DIAGONAL=(HISTOGRAM )
START=TOPLEFT
NOLEGEND
;
RUN;
TITLE; FOOTNOTE;
I'm just unsure how to select variables from the matrix which have a pair of over 60% correlation, it could even by a by NAME return the columns with corr over 60%
arrays select macros sas select-into
arrays select macros sas select-into
asked Nov 13 '18 at 11:22
7828221978282219
124
124
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
You can get the pairs like this - the key is the vname
function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32.;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Perhaps from there you can work out the rest?
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Edit: to include complete answer:
PROC TRANSPOSE is used to transpose the correlation matric to x,y pairs and subset to correlations of interest. A macro variable is created to use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53279943%2fselect-highest-correlated-pairs-from-a-matrix-sas%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can get the pairs like this - the key is the vname
function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32.;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Perhaps from there you can work out the rest?
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
add a comment |
You can get the pairs like this - the key is the vname
function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32.;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Perhaps from there you can work out the rest?
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
add a comment |
You can get the pairs like this - the key is the vname
function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32.;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Perhaps from there you can work out the rest?
You can get the pairs like this - the key is the vname
function, which returns the name of an array element:
data high_corrs;
set correlation_matrix;
array coefs i--x3;
length var1 var2 $32.;
do j = 1 to dim(coefs);
corr = coefs(j);
if _n_ < j and corr > 0.6 then do;
var1 = vname(coefs(_n_));
var2 = vname(coefs(j));
output;
end;
end;
keep var1 var2 corr;
run;
Perhaps from there you can work out the rest?
answered Nov 13 '18 at 14:10
user667489user667489
7,96721227
7,96721227
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
add a comment |
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
thanks, I'll get looking at this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Edit: to include complete answer:
PROC TRANSPOSE is used to transpose the correlation matric to x,y pairs and subset to correlations of interest. A macro variable is created to use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Edit: to include complete answer:
PROC TRANSPOSE is used to transpose the correlation matric to x,y pairs and subset to correlations of interest. A macro variable is created to use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Edit: to include complete answer:
PROC TRANSPOSE is used to transpose the correlation matric to x,y pairs and subset to correlations of interest. A macro variable is created to use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
Edit: to include complete answer:
PROC TRANSPOSE is used to transpose the correlation matric to x,y pairs and subset to correlations of interest. A macro variable is created to use in PROC SGSCATTER.
NOTE: PLOTREQUESTS=x1*x2 x1*y x2*x3 x2*y
data have;
do i = 1 to 1000;
y = ranuni(0);
x1 = y ** 2;
x2 = x1 ** 3;
x3 = x2 - x1/2;
output;
end;
run;
proc corr data=have out=corr noprint;
run;
proc transpose name=with data=corr out=pair(where=(.6 le abs(col1) lt 1));
where _type_ eq 'CORR';
by _name_ notsorted;
run;
data pairV / view=pairv;
set pair;
call sortc(_name_,with);
run;
proc sort data=pairv out=pair2 nodupkey;
by _name_ with;
run;
proc sql noprint;
select catx('*',_name_,with) into :plotrequests separated by ' ' from pair2;
quit;
%put NOTE: &=plotrequests;
proc sgscatter data=have;
plot &plotrequests;
run;
quit;
edited Nov 14 '18 at 18:00
answered Nov 13 '18 at 14:11
data _null_data _null_
5,342610
5,342610
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
Thanks, I'll start building on this
– 78282219
Nov 14 '18 at 13:46
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53279943%2fselect-highest-correlated-pairs-from-a-matrix-sas%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown