MultiHistogram

What’s this?

A macro that draws a multihistogram using SAS GRAPH.

A multihistogram is a set of histograms, one for each combination of the category variable and the pair variable. The bar width of each histogram reflects the response variable. A multihistogram is used to compare frequencies across multiple categories in a small display area.

This macro is based on the report by Wierenga et al. [1].

Input data

The input data must be summarized. PROC UNIVARIATE is useful for summarizing raw data. Formats are required for the category, pair, and level variables.

key   variable          type
1     category          numeric
2     pair (optional)   numeric
3     level             numeric
      response          numeric
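As one way to produce summarized input of this shape, the sketch below counts raw records with PROC FREQ rather than PROC UNIVARIATE. The dataset raw, the variables grp and score, and the formats grpf and scoref are hypothetical names used only for illustration.

```sas
/* Hypothetical formats for the category and level variables */
proc format;
   value grpf
      1="Group A"
      2="Group B"
      ;
   value scoref
      1="low"
      2="middle"
      3="high"
      ;
run;

/* Hypothetical raw data: one row per subject */
data raw;
   input grp score @@;
   format grp grpf. score scoref.;
   datalines;
1 1  1 2  1 2  1 3  2 1  2 3  2 3  2 3
;
run;

/* Count the rows for each category x level combination.          */
/* SPARSE keeps combinations that never occur (count of 0).       */
proc freq data=raw noprint;
   tables grp*score / out=summary(drop=percent) sparse;
run;
```

The resulting summary dataset should contain one row per combination of grp and score, with the frequency stored in the variable COUNT, which could then be passed as the response variable.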

Syntax

ods graphics / < graphics option > ;
ods listing gpath=< output path >;

%macro multihistogram(
   data=,
   category=,
   pair=none,
   level=,
   levelfmt=,
   response=,
   cattitle=#,
   leveltitle=#,
   pairtitle=#,
   responsetxt=false,
   orient=v,
   pairsplit=false,
   legend=true,
   palette=sns,
   note=,
   deletedata=True
   );

Parameters

  • data : dataset name (required)

    Input data. The where and rename dataset options can be used, but the keep option cannot.

  • category : variable name (required)

    Category variable.

  • pair : variable name (optional)

    Binary variable such as Yes/No or Male/Female. Default is “none”.

  • level : variable name (required)

    Level variable.

  • levelfmt : format name (required)

    Format of the level variable.

  • response : variable name (required)

    Response variable such as a frequency or percentage.

  • cattitle : text (optional)

    Label of the category axis. Default is the label of the category variable.

  • leveltitle : text (optional)

    Label of the level axis. Default is the label of the level variable.

  • pairtitle : text (optional)

    Label of the pair legend. Default is the label of the pair variable.

  • responsetxt : bool (optional)

    If “true”, the response values are displayed. Default is “false”.

  • orient : keyword (optional)

    Orientation of the graph: vertical (v) or horizontal (h). Default is “v”.

  • pairsplit : bool (optional)

    If “true”, the histogram is split by the pair variable. Default is “false”.

  • legend : bool (optional)

    If “true”, the legend of the pair variable is displayed. If the pair parameter is “none”, this parameter is ignored. Default is “true”.

  • palette : keyword (optional)

    Color palette for fills, lines, and markers. The palettes listed below are available. See the color palette section of the introduction page. Default is “SNS” (Seaborn default palette).

    • SAS

    • SNS (Seaborn)

    • STATA

    • TABLEAU

  • note : statement (optional)

    Text entry statements to insert into the graph template, used to display a title or footnote in the output image. Default is “” (nothing is displayed).

  • deletedata : bool (optional)

    If “True”, the temporary datasets and catalogs generated by the macro are deleted at the end of execution. Default is “True”.
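A call that sets several of the optional parameters might look like the sketch below. It assumes that the SAS Plotter macros are already loaded and that the freq dataset and haircolorf format from the example section exist. The image name multihisto_options is a hypothetical value chosen for illustration.

```sas
ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_options";

/* orient=h         : horizontal layout                          */
/* legend=false     : hide the legend of the pair variable       */
/* palette=tableau  : one of SAS, SNS, STATA, TABLEAU            */
/* deletedata=false : keep the temporary datasets for inspection */
%multihistogram(
   data=freq,
   category=eyes,
   pair=region,
   level=hair,
   levelfmt=haircolorf,
   response=count,
   orient=h,
   legend=false,
   palette=tableau,
   deletedata=false
);
```

Setting deletedata=false can be helpful while debugging, because the intermediate datasets remain available after the macro finishes.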

Example

The output examples can be reproduced by running the following code after loading SAS Plotter.

code

ods listing gpath="your output path";
filename exam url "https://github.com/Superman-jp/SAS_Plotter/raw/main/example/multihistogram_example.sas" encoding='UTF-8';
%include exam;

Basic multiple histogram

The data must be aggregated before the macro is used.

raw data

proc format;
value regionf
   1="Region 1"
   2="Region 2"
   ;
value eyecolorf
   1="blue"
   2="brown"
   3="green"
   ;
value haircolorf
   1="black"
   2="dark"
   3="fair"
   4="medium"
   5="red";
run;

data Color;
format region regionf. eyes eyecolorf. hair haircolorf.;
input Region Eyes Hair Count @@;
label Eyes  ='Eye Color'
      Hair  ='Hair Color'
      Region='Geographic Region';

datalines;
1 1 3 23  1 1 5 7   1 1 4 24
1 1 2 11  1 3 3 19  1 3 5 7
1 3 4 18  1 3 2 14  1 2 3 34
1 2 5 5   1 2 4 41  1 2 2 40
1 2 1 3   2 1 3 46  2 1 5 21
2 1 4 44  2 1 2 40  2 1 1 6
2 3 3 50  2 3 5 31  2 3 4 37
2 3 2 23  2 2 3 56  2 2 5 42
2 2 4 53  2 2 2 54  2 2 1 13
;
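/* sort the raw counts by the merge keys */
proc sort data=Color;
   by region eyes hair;
run;

/* build every region x eyes x hair combination */
data dummy;
   do region = 1 to 2;
      do eyes = 1 to 3;
         do hair = 1 to 5;
            output;
         end;
      end;
   end;
run;

/* merge so that combinations missing from the raw data get a count of 0 */
data freq;
   merge dummy color;
   by region eyes hair;
   if count=. then count=0;
run;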

code

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_basic_v" ;
title "basic multiple histogram(vertical)";

%multihistogram(
   data=freq(where=(region=1)),
   category=eyes,
   level=hair,
   response=count,
   levelfmt =haircolorf,
   note=%nrstr(entrytitle 'your title here';
            entryfootnote halign=left 'your footnote here';
            entryfootnote halign=left 'your footnote here 2';)
);
[Figure: _images/multihisto_basic_v1.svg]

Orientation

When the orient parameter is set to “h”, the histogram is displayed horizontally.

code

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_basic_h" ;
title "basic multiple histogram(horizontal)";

%multihistogram(
   data=freq(where=(region=1)),
   category=eyes,
   level=hair,
   response=count,
   orient=h,
   levelfmt =haircolorf
);
[Figure: _images/multihisto_basic_h1.svg]

Split mode

The pair parameter is optional. If it is not set, the bar color is determined by the level. In normal mode (pairsplit=false), the bars of each histogram extend symmetrically to both sides, like a violin plot. In split mode (pairsplit=true), the symmetrical histogram is divided into two halves according to the pair variable.

code

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_pair" ;
title " multiple histogram using pair variable";

%multihistogram(
   data=freq,
   category=eyes,
   pair=region,
   level=hair,
   response=count,
   levelfmt =haircolorf
);
[Figure: _images/multihisto_pair1.svg]

code

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_split" ;
title "multiple histogram (split mode)";
%multihistogram(
   data=freq,
   category=eyes,
   pair=region,
   level=hair,
   response=count,
   pairsplit=true,
   levelfmt =haircolorf
);
[Figure: _images/multihisto_pairsplit.svg]

Display response value

In a basic multiple histogram, the response values are difficult to read from the plot because there is no response axis. If the responsetxt parameter is set to “true”, the response values are displayed.

code

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_restxt" ;
title "basic multiple histogram with response value";
%multihistogram(
   data=freq,
   category=eyes,
   pair=region,
   level=hair,
   response=count,
   responsetxt=true,
   levelfmt =haircolorf
);
[Figure: _images/multihisto_restxt1.svg]
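Because the response variable may also be a percentage, the counts can be converted before plotting. The sketch below is one possible approach, not part of the original examples: it computes the percentage of each hair color within each region and eye color using PROC SQL. The dataset freq_pct, the variable pct, and the image name multihisto_pct are hypothetical names; how the macro formats the displayed values is not stated in this page and is left as an assumption.

```sas
/* Percentage of each hair color within region x eye color.      */
/* PROC SQL remerges the group total back onto every detail row. */
proc sql;
   create table freq_pct as
   select region, eyes, hair, count,
          case
             when sum(count) > 0 then 100 * count / sum(count)
             else 0
          end as pct format=5.1 label='Percent'
   from freq
   group by region, eyes
   order by region, eyes, hair;
quit;

ods graphics / height=15cm width=15cm imagefmt=png imagename="multihisto_pct";
title "multiple histogram with percentage as response";

%multihistogram(
   data=freq_pct,
   category=eyes,
   pair=region,
   level=hair,
   response=pct,
   responsetxt=true,
   levelfmt=haircolorf
);
```

With percentages, the bar widths become comparable between groups of different sizes, which may be preferable when the two regions contain very different numbers of observations.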

Reference

[1] Madison R. Wierenga, Ciera R. Crawford, and Cordelia A. Running. Older US adults like sweetened colas, but not other chemesthetic beverages. Journal of Texture Studies, 51(5):722–732, 2020. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8140601/.