Ridgeline plot

What’s this?

Macro for generate ridgeline plot using SAS GRAPH.

Ridgeline plot shows the distribution of response variable each groups. the distribution is estimated by KDE (Kernel Density Estimation) using proc kde.

rugplot and several descriptive statistics (mean, q1, q2, q3) can be overlaid on the ridgeline plot.

Input data

key

variable

type

1

category

numeric or string

2

group response

numeric or string numeric

group variable is optional.

I recommended that the variable type of category and group are set to numeric with format. if the variable type is string, item order is defined as ascending character order.

Syntax

ods graphics / < graphics option > ;
ods listing gpath=< output path >;

%macro ridgeline(
   data=,
   x=,
   y=,
   group=None,
   xlabel=x,
   ylabel=y,
   yticks=,
   cat_iv = 1.2,
   gridsize = 401,
   bw_method = sjpi,
   bw_adjust = 1,

   quartile=False,
   mean=False,

   legend = False,
   grouplegendtitle=group,

   rug = False,
   ruglength = 2,

   fillstyle=None,
   qgradient=1,
   palette=sns,
   note=,
   deletedata=True);

Parameters

  • data : dataset name (required)

    input data. keep, rename and where options are available.

  • x : variable name (required)

    category variable

  • y : variable name (required)

    response variable

  • group : variable name (optional)

    group variable for grouping data at each category.

    when the parameter is not set, all of outline color is set black.

    when the parameter is set category variable, the graph object of each category is set different color.

    default is “None”.

  • xlabel : string (optional)

    label string of category axis. default is the label of x variable. when the label is not displayed , set like below.

    xlabel=,

  • ylabel : string (optional)

    optional. label string of response axis. default is the label of y variable. when the label is not displayed , set like below.

    ylabel=,

  • yticks : numeric list (required)

    tickvalue list of response axis. the list is set as the numeric list separated by space . the item of the list should be set ascending order.

    ex. yticks = 10 20 30 40,

  • cat_iv : numeric (optional)

    the interval of category. the default is 1.2. when this parameter is set below 1, each density may be overlapped.

  • gridsize : integer (optional)

    the number of KDE grid size. default is 401 (the default of proc kde)

  • bw_method : keyword (optional)

    the bandwidth estimation method of KDE. default is “sjpi” (the default of proc kde).

    method keyword described below is available.

    • sjpi (Sheather-Jones plug-in)

    • snr (simple normal reference)

    • snrq (simple normal reference that uses the interquartile range)

    • srot (Silverman’s rule of thumb)

    • os (oversmoothed)

  • bw_adjust : numeric (optional)

    the bandwidth multiplier. Increasing will make the curve smoother. the default is 1.

  • quartile : bool (optional)

    if this parameter is set “True”, quartile lines will be displayed as dashed lines each density. the default is “False”.

  • mean : bool (optional)

    if this parameter is set “True”, mean lines will be displayed as solid lines each density. the default is “False”.

  • rug : bool (optional)

    if “True” the rugplot is displayed. the default is “False”

  • ruglength : numeric (optional)

    the scale factor of the rugplot.when rug parameter is set “False”, this parameter will be ignored. the default is 1. if set larger value, the rugplot is prolonged.

  • legend : bool (optional)

    if “True” the legend of group item is displayed. if group parameter is “None”, the parameter will be ignored. default is “False”.

  • grouplegendtitle : text (optional)

    the title of group legend. if legend parameter is set “True”, this parameter will be ignored. default is the label of group variable.

  • fillstyle : keyword (optional)

    the fill style of density plot. the style keyword described below is available. if group parameter is not set, this parameter can be set “None” only.

    • None: fill is not displayed.

    • Group: fill color of each density will be defined based on group parameter.

    • Quartile: fill color will be defined based on the quartiles of each x variable.

  • qgradient : 1 to 7 (optional)

    the color gradient pattern of quartile. this parameter can be set gradient ID (1 to 7). default is “1”.

  • palette : keyword (optional)

    color palette for fill, line and markers. the palettes described below is available. see color palette section of introduction page. default is “SNS” (Seaborn default palette).

    • SAS

    • SNS (Seaborn)

    • STATA

    • TABLEAU

  • note : statement (optional)

    insert the text entry statement into the graph template and display the title or footnote in the output image. default is “” (not displayed)

  • deletedata : bool (optional)

    if True, the temporary datasets and catalogs generated by macros will be deleted at the end of execution. default is True.

example

output example can be executed using following code after loading SAS plotter.

code

ods listing gpath="your output path";
filename exam url "https://github.com/Superman-jp/SAS_Plotter/raw/main/example/ridgeline_example.sas" encoding='UTF-8';
%include exam;

simple ridgeline plot

raw data

data raw;
infile datalines delimiter=',';
length col1 $10;
format date yymmdd10.;
input col1 $  tokyo  naha ;
date = input(col1,yymmdd10.);
drop col1;

datalines;
2023/1/1,13,19.8
2023/1/2,12.1,19.4
2023/1/3,11,21.4
2023/1/4,11,19.5
2023/1/5,10.6,20.4
2023/1/6,9.9,21.7
2023/1/7,10.4,19.4
2023/1/8,12.5,21.2
2023/1/9,13.9,22.5
2023/1/10,9.9,22.8
2023/1/11,10.7,23.3
2023/1/12,12.6,24.3
2023/1/13,14,25.4
2023/1/14,14.2,25.3
2023/1/15,12,22.6
2023/1/16,7.7,17.9
2023/1/17,7.6,19.1
2023/1/18,12.4,17.9
2023/1/19,9.1,18.6
2023/1/20,12.3,18.7
2023/1/21,9,20.7
2023/1/22,8.9,24.1
2023/1/23,6.1,22.7
2023/1/24,11.8,20.2
2023/1/25,3.7,15.5
2023/1/26,8.5,20.6
2023/1/27,6.6,17.5
2023/1/28,8.3,14.8
2023/1/29,8.8,16.1
2023/1/30,10,16.6
2023/1/31,9,19.5
2023/2/1,13.1,23.2
2023/2/2,9.2,21.6
2023/2/3,6.2,19.8
2023/2/4,11.2,18.6
2023/2/5,12,21.3
2023/2/6,13.6,22.7
2023/2/7,15.4,22.9
2023/2/8,11.7,22.9
2023/2/9,10.6,23
2023/2/10,3.5,23.6
2023/2/11,14.1,22.8
2023/2/12,16.9,25
2023/2/13,10.3,25.6
2023/2/14,10.7,19.9
2023/2/15,7.8,17.2
2023/2/16,9.6,21.2
2023/2/17,10.8,22.5
2023/2/18,15,23.6
2023/2/19,18.5,24.2
2023/2/20,14.7,19
2023/2/21,9.2,19.1
2023/2/22,10.4,22.1
2023/2/23,14.4,23.2
2023/2/24,12.1,23.5
2023/2/25,12.7,19.4
2023/2/26,10.7,17.9
2023/2/27,15,18.3
2023/2/28,19.4,21.8
2023/3/1,19.4,23.5
2023/3/2,20,19.2
2023/3/3,13.4,18.1
2023/3/4,17.4,20.9
2023/3/5,13.1,21.1
2023/3/6,15,21.1
2023/3/7,18.8,22
2023/3/8,21.3,23.9
2023/3/9,22.2,24
2023/3/10,22.9,22.9
2023/3/11,20.8,24.3
2023/3/12,18.8,24
2023/3/13,16.7,19.3
2023/3/14,14.6,20.7
2023/3/15,17.9,24.1
2023/3/16,20.9,24.5
2023/3/17,15.1,24.8
2023/3/18,9.5,22.8
2023/3/19,15.5,22.9
2023/3/20,19.2,21.7
2023/3/21,18.5,25.9
2023/3/22,23.8,26.3
2023/3/23,19.7,26.9
2023/3/24,25,27.3
2023/3/25,14.1,22.9
2023/3/26,12.5,20.3
2023/3/27,16.9,21.7
2023/3/28,13.4,23.9
2023/3/29,18.6,23.1
2023/3/30,19.7,23.2
2023/3/31,20.1,24.4
2023/4/1,23.3,24.3
2023/4/2,19,23.8
2023/4/3,18.4,25.2
2023/4/4,21,25
2023/4/5,21.8,24.7
2023/4/6,23.4,25.9
2023/4/7,21.3,23.8
2023/4/8,21.1,20.8
2023/4/9,17.7,21.9
2023/4/10,21.6,23.7
2023/4/11,25,26.3
2023/4/12,24.5,26.4
2023/4/13,21.3,24.9
2023/4/14,22.5,26.5
2023/4/15,18,26
2023/4/16,24.5,27.2
2023/4/17,20.3,28.7
2023/4/18,21,27.1
2023/4/19,24.9,26.5
2023/4/20,26,25.8
2023/4/21,26.8,26.2
2023/4/22,18.4,25.2
2023/4/23,20.5,25.8
2023/4/24,16,26.2
2023/4/25,19.1,27.1
2023/4/26,19.7,22.3
2023/4/27,23.8,24.4
2023/4/28,24.1,27
2023/4/29,23.8,28.4
2023/4/30,22,25.3
2023/5/1,22.5,24.6
2023/5/2,23,25.4
2023/5/3,23.4,25.8
2023/5/4,25.7,27.4
2023/5/5,26.6,27.5
2023/5/6,27.9,28.1
2023/5/7,22.3,26.6
2023/5/8,16.6,23.4
2023/5/9,20.6,25
2023/5/10,25,25.4
2023/5/11,23.4,25.7
2023/5/12,21.9,26.1
2023/5/13,18.1,26.6
2023/5/14,21.2,23.5
2023/5/15,18.1,25.3
2023/5/16,27.5,28.2
2023/5/17,31.6,27.6
2023/5/18,32.2,27.8
2023/5/19,24.5,27
2023/5/20,22.8,27
2023/5/21,26.3,27.4
2023/5/22,29.4,28.5
2023/5/23,17.2,25.2
2023/5/24,23.9,26.4
2023/5/25,23.8,28
2023/5/26,24.9,27.6
2023/5/27,26.7,27.3
2023/5/28,25.8,27.8
2023/5/29,21.3,30
2023/5/30,26.3,29.4
2023/5/31,22.2,29.6
2023/6/1,26.2,26.6
2023/6/2,23.8,26.5
2023/6/3,24,27.7
2023/6/4,27.5,29.3
2023/6/5,29.8,30.5
2023/6/6,25.1,29.1
2023/6/7,29.6,30.1
2023/6/8,28.7,29.9
2023/6/9,22.2,31.2
2023/6/10,26,30.2
2023/6/11,22.8,30.9
2023/6/12,22.5,29.4
2023/6/13,29.4,28
2023/6/14,25.8,25.8
2023/6/15,24.1,28
2023/6/16,30.7,28.5
2023/6/17,30.9,27.6
2023/6/18,31.1,30.3
2023/6/19,27.7,28.1
2023/6/20,27,30
2023/6/21,27.2,30.5
2023/6/22,22.6,29.9
2023/6/23,24.9,30.2
2023/6/24,29.5,31.9
2023/6/25,30.9,31.4
2023/6/26,30.4,31.5
2023/6/27,29.6,31.7
2023/6/28,32.3,30.7
2023/6/29,33.8,32.8
2023/6/30,30.5,32.3
2023/7/1,27.8,32
2023/7/2,31.7,32.5
2023/7/3,32.2,32.7
2023/7/4,30.6,32
2023/7/5,28.7,32.3
2023/7/6,33.4,32.6
2023/7/7,34.2,33.3
2023/7/8,30.9,32.8
2023/7/9,33,33.5
2023/7/10,36.5,34.1
2023/7/11,34.3,33.8
2023/7/12,37.5,34.2
2023/7/13,30.3,33.9
2023/7/14,30.5,33.7
2023/7/15,32.9,32.8
2023/7/16,35.3,32.7
2023/7/17,36.2,31.5
2023/7/18,37.5,32.8
2023/7/19,33.9,33.1
2023/7/20,32.6,33
2023/7/21,31.6,33.9
2023/7/22,33.4,33.9
2023/7/23,33.4,34.3
2023/7/24,35.7,33.2
2023/7/25,36.6,32.9
2023/7/26,37.7,31.3
2023/7/27,36.9,29.6
2023/7/28,36.2,32
2023/7/29,35.7,32.3
2023/7/30,36.6,32.6
2023/7/31,36.1,30.9
;
run;

proc format ;
value regionf
1="Tokyo"
2="Naha";
run;

data max_temp;
set raw;
format region regionf.;
label max_temp="maximum temperature (degree Celsius)"
      month="Month"
   region="Region";

month=month(date);
year=year(date);
region=1; max_temp=tokyo;output;
region=2; max_temp=naha; output;
keep year month  date region max_temp;
run;

code

title "basic ridgeline plot";
ods graphics / height=15cm width=15cm imagefmt=png imagename="ridge_simple" ;
%ridgeline(
   data=max_temp(where=(region=1)),
   x=month,
   y=max_temp,

   yticks=0 5 10 15 20 25 30 35 40,
   note=%nrstr(entrytitle 'your title here';
            entryfootnote halign=left 'your footnote here';
            entryfootnote halign=left 'your footnote here 2';)
               );

output

_images/ridge_simple1.svg

grouped ridgeline plot

if group parameter is set, density outline color will be defined based by group and fill appearance will be available.

code

title "grouped ridgeline plot";
ods graphics / height=15cm width=25cm imagefmt=png imagename="ridge_group" ;
%ridgeline(
   data=max_temp,
   x=month,
   y=max_temp,
   group=region,
   yticks=0 5 10 15 20 25 30 35 40,
   legend=true
   );

output

_images/ridge_group1.svg

ridgeline plot with statistics and individual value

this macro is supported displaying statistics, mean and quartile. Mean and quartile is displayed as line. the individual value is displayed as rug plot.

code

title "ridgeline plot with statistics";
ods graphics /reset=all height=15cm width=25cm imagename="ridgeline_stat" imagefmt=png;

%ridgeline(
   data=max_temp(where=(region=1)),
   x=month,
   y=max_temp,
   group=region,
   yticks=0 5 10 15 20 25 30 35 40,
   legend=true,
   mean=true,
   quartile=true,
   rug=true,
   deletedata=false
   );

output

_images/ridgeline_stat1.svg

fill style

This macro is supported three fill appearance, None, Group, and Quartile.

Group appearance and Quartile appearance is available when group parameter is set.

group fill style

code

title "ridgeline plot with fill style";
ods graphics /reset=all height=15cm width=25cm imagename="ridgeline_fill" imagefmt=png;

%ridgeline(
   data=max_temp,
   x=month,
   y=max_temp,
   group=region,
   yticks=0 5 10 15 20 25 30 35 40,
   fillstyle=group,
   legend=true,
   mean=false,
   quartile=true,
   rug=true
   );

output

_images/ridgeline_fill1.svg

quartile fill style

code

title "ridgeline plot with quartile style";
ods graphics /reset=all height=15cm width=25cm imagename="ridgeline_qtr" imagefmt=png;

%ridgeline(
   data=max_temp (where=(region=1)),
   x=month,
   y=max_temp,
   group=region,
   yticks=0 5 10 15 20 25 30 35 40,
   fillstyle=quartile,
   legend=true,
   mean=false,
   quartile=true,
   rug=true
   );
_images/ridgeline_qtr1.svg

quartile gradient style are as follows.

_images/ridgeline_qgradient11.svg_images/ridgeline_qgradient21.svg_images/ridgeline_qgradient31.svg_images/ridgeline_qgradient41.svg_images/ridgeline_qgradient51.svg_images/ridgeline_qgradient61.svg_images/ridgeline_qgradient71.svg