SAS vs R Programming: Which to Choose and How to Switch

August 25, 2022

<p><b>UPDATED</b> in April, 2025.</p>
<p>SAS is losing ground across industries as R and the Shiny package gain popularity. Shiny gives users custom interactivity on top of their R routines, which makes it attractive for data teams. This article focuses on SAS vs R in the pharmaceutical industry, but keep in mind that the insights apply to any data science team considering a switch in their analytics toolkit.</p>

<p>Today's data science teams need SAS alternatives that handle complex technical needs while providing interactive data storytelling for non-technical users. When evaluating options, the choice often comes down to Python vs R. And while drag-and-drop BI tools exist, they can't match the custom development, machine learning, and big data capabilities of programming languages.</p>

<blockquote><a href="https://appsilon.com/appsilon-data-science-is-now-an-rstudio-full-service-certified-partner/" target="_blank" rel="noopener">Appsilon is a Posit Full Service Certified Partner</a>. Find out how we can help you with R and Python development services and RStudio discounts.</blockquote>

<p>If your team uses Python and feels comfortable with it, we won't try to convert you. But if you see value in R for data analytics and statistical work, we recommend exploring what R can do for your organization.</p>

<p>Statistical analysis follows a structured process: defining the problem, collecting data, wrangling it, analyzing it, and communicating results. The analysis phase typically involves summarizing data with descriptive statistics and applying inferential methods through hypothesis testing and modeling. Results are usually shared through reports with visualizations to provide context, especially for those who weren't involved in the analysis process.</p>

<p>Analysts can choose from many tools to perform this work, but some solutions fit certain scenarios better than others. In this article, we'll compare two popular options: R and SAS, and help you decide which one makes the most sense for your specific needs.</p>

<h2>Understanding SAS and R Programming</h2>

<p>SAS and R are both solid statistical software platforms used by researchers and data scientists to create data analyses and visualizations. Let's break down what each one offers.</p>

<h3>What is SAS?</h3>

<p>SAS is commercial software for advanced analytics, business intelligence, data management, and predictive modeling. You can use it through both a graphical interface and the SAS programming language.</p>

<p>A SAS program consists of steps you submit for execution. Each step performs a specific task, and there are only two types:</p>

<ul>
<li>DATA steps: where you create, import, modify, merge, or calculate data</li>
<li>PROC steps: a group of statements that call and execute a procedure, usually with a SAS data set as input. These procedures analyze data to produce statistics, tables, reports, and visualizations.</li>
</ul>

<p>A SAS program can contain any combination of these steps. The exact structure depends on what you need to accomplish.</p>

<h3>SAS program example</h3>
<p>Here's a simple example that uses SAS to compare group means. This showcases both the code structure and output format:</p>

<pre><code class="language-sas">* create example dataset;
data patients;
input patient_id treatment $ age;
cards;
1 a 24
2 a 23
3 a 25
4 b 30
5 b 36
6 b 34
;
run;

* compare group means;
ods graphics on;

proc ttest cochran ci=equal umpu;
class treatment;
var age;
run;

ods graphics off;</code></pre>

<p>In this example, we first create a dataset in the DATA step and then use a PROC step to perform our analysis. The output includes both tables and charts with standard styling.</p>

<h3>The SAS software suite</h3>
<p>SAS isn't a single program but a suite of components for different data management and analysis needs:</p>

<ul>
<li><a href="https://support.sas.com/en/software/base-sas-support.html" target="_blank" rel="noopener">Base SAS</a>: For data access, transformation, and reporting</li>
<li><a href="https://support.sas.com/rnd/app/stat/index.html#s1=6" target="_blank" rel="noopener">SAS/STAT</a>: For statistical analysis</li>
<li><a href="https://support.sas.com/en/software/sasgraph-support.html" target="_blank" rel="noopener">SAS/GRAPH</a>: For creating data visualizations</li>
<li><a href="https://support.sas.com/rnd/app/iml/index.html#s1=6" target="_blank" rel="noopener">SAS/IML</a>: Interactive Matrix Language for implementing custom algorithms</li>
</ul>

<h3>What is R programming?</h3>
<p><a href="https://www.r-project.org/about.html" target="_blank" rel="noopener">R</a> is an open-source language and environment for statistical computing and graphics. It offers a wide range of techniques including linear and nonlinear modeling, statistical tests, time-series analysis, classification, and clustering. The most popular IDE for R is <a href="https://posit.co/download/rstudio-desktop/" target="_blank">RStudio</a> by <a href="https://posit.co/" target="_blank" rel="noopener">Posit PBC</a>.</p>

<p>In R, data lives in objects that can store anything from simple values to complex datasets. You work with these objects by creating and applying functions. This function-centered approach gives R flexibility for tackling a wide range of analytical problems.</p>
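<p>As a minimal sketch of that idea, the snippet below stores a value, a vector, and a data frame as objects and then applies a small function to one of them. The names <code>dose_mg</code> and <code>standardize</code> are made up for illustration.</p>

<pre><code class="language-r"># A single value, a vector, and a data frame are all objects
dose_mg <- 50
ages <- c(24, 23, 25, 30, 36, 34)
patients <- data.frame(id = 1:6, age = ages)

# A user-defined function is an object too (hypothetical helper)
standardize <- function(x) {
  (x - mean(x)) / sd(x)
}

# Applying a function to an object returns another object
patients$age_z <- standardize(patients$age)

class(patients)
class(standardize)</code></pre>

<p>Because functions are ordinary objects, they can be stored, passed to other functions, and bundled into packages like any other value.</p>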

<blockquote>Thinking about switching to R and Shiny? See why you might want to <a href="https://appsilon.com/why-you-should-use-r-shiny-for-enterprise-application-development/" target="_blank" rel="noopener">switch to R Shiny for enterprise application development</a>.</blockquote>

<h3>R packages</h3>
<p>One of R's biggest strengths is its package ecosystem. A package is a shareable collection of code that performs specific tasks. Some popular packages for data science include <a href="https://readr.tidyverse.org/" target="_blank" rel="noopener">readr</a> for importing data, <a href="https://dplyr.tidyverse.org/" target="_blank" rel="noopener">dplyr</a> for data manipulation, <a href="https://tidyr.tidyverse.org/" target="_blank" rel="noopener">tidyr</a> for data cleaning, and <a href="https://ggplot2.tidyverse.org/" target="_blank" rel="noopener">ggplot2</a> for visualization.</p>

<p>Anyone can create and share R packages through <a href="https://cran.r-project.org/" target="_blank" rel="noopener">CRAN</a> (Comprehensive R Archive Network), or build private packages for use within an organization. Currently, CRAN hosts over 18,000 publicly available packages!</p>

<p>Appsilon contributes to this ecosystem through our <a href="https://shiny.tools/" target="_blank" rel="noopener">Shiny tools</a> – packages that help developers build scalable, reproducible, and visually appealing Shiny applications.</p>

<h3>R program example</h3>
<p>Let's recreate the SAS example from earlier using R code to compare the syntax and output:</p>

<pre><code class="language-r"># create example dataset
patients <- data.frame(
  patient_id = 1:6,
  treatment = rep(c("a", "b"), each = 3),
  age = c(24, 23, 25, 30, 36, 34)
)

# compare group means
t.test(age ~ treatment, data = patients)</code></pre>

<p>The R version accomplishes the same task with less code and a cleaner syntax. While R's default output is plain text, you can enhance it with tools like <a href="https://rmarkdown.rstudio.com/" target="_blank" rel="noopener">R Markdown</a> for reports or Shiny for interactive applications.</p>
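<p>One detail to keep in mind when comparing outputs: <code>t.test()</code> runs Welch's unequal-variance test by default, while PROC TTEST reports both the pooled and the Satterthwaite results. The sketch below, which reuses the same six-patient dataset, runs both variants in R and pulls individual numbers out of the result object.</p>

<pre><code class="language-r">patients <- data.frame(
  patient_id = 1:6,
  treatment = rep(c("a", "b"), each = 3),
  age = c(24, 23, 25, 30, 36, 34)
)

# Welch test (the default): does not assume equal variances
welch <- t.test(age ~ treatment, data = patients)

# Pooled-variance test: assumes equal variances in both groups
pooled <- t.test(age ~ treatment, data = patients, var.equal = TRUE)

# The result is a list-like "htest" object, so pieces can be reused
welch$estimate
welch$conf.int
c(welch = welch$p.value, pooled = pooled$p.value)</code></pre>

<p>Having the result as an object is what makes it straightforward to feed the numbers into a report or a Shiny app instead of copying them from printed output.</p>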

<p>The styling possibilities with R are virtually limitless. Check out some of our <a href="https://demo.appsilon.com/">Shiny demos</a> to see what's possible with custom visualization and reporting.</p>

<h2 id="comparison">Core Feature Comparison</h2>
<p>When choosing between SAS and R, several key factors will likely influence your decision. Let's compare these platforms across cost, functionality, collaboration features, and access to new developments.</p>

<h3 id="cost">Cost of SAS vs R for data science teams</h3>
<p>The cost difference between these platforms is significant and often a deciding factor for many organizations.</p>

<p><strong>SAS</strong> is commercial software that requires paid licensing. SAS licenses are known to be expensive, which can make it difficult for individuals and small businesses to use or scale their data science operations. Enterprise licenses can run into six figures annually for larger deployments.</p>

<p><strong>R</strong> is completely free and open source. Anyone can download it and start using it right away without any upfront investment. This makes R accessible to everyone from students to large enterprises looking to build robust data science capabilities.</p>

<blockquote>Join the Shiny movement and develop your own <a href="https://appsilon.com/r-shiny-dashboard-templates/" target="_blank" rel="noopener">R Shiny dashboard in less than 10 minutes</a>!</blockquote>

<h3 id="functionality">Functionality comparison</h3>
<p>Both platforms accomplish similar goals but take different approaches to data analysis tasks:</p>

<table style="width: 100%;">
<tbody>
<tr>
<td><strong>SAS</strong></td>
<td><strong>R</strong></td>
</tr>
<tr>
<td><a href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1kh25to5o0wmvn1o4n4hsl3yyww.htm" target="_blank" rel="noopener">Data steps</a> for data manipulation</td>
<td>Function-based expressions for data transformation</td>
</tr>
<tr>
<td><a href="https://support.sas.com/learn/le/proc/index.html" target="_blank" rel="noopener">Procedures</a> for analysis</td>
<td>Function calls with customizable parameters</td>
</tr>
<tr>
<td><a href="https://www.tutorialspoint.com/sas/sas_macros.htm" target="_blank" rel="noopener">Macros</a> for reusable code</td>
<td>User-defined functions and packages</td>
</tr>
<tr>
<td><a href="https://www.tutorialspoint.com/sas/sas_functions.htm" target="_blank" rel="noopener">Built-in SAS functions</a></td>
<td><a href="https://www.w3schools.com/r/r_functions.asp" target="_blank" rel="noopener">R functions</a> from base R and packages</td>
</tr>
<tr>
<td><a href="https://www.tutorialspoint.com/sas/sas_output_delivery_system.htm" target="_blank" rel="noopener">Output Delivery System (ODS)</a></td>
<td><a href="https://rmarkdown.rstudio.com/" target="_blank" rel="noopener">R Markdown</a>, <a href="https://quarto.org/" target="_blank" rel="noopener">Quarto</a></td>
</tr>
</tbody>
</table>

<p>SAS uses a more procedural approach with distinct DATA and PROC steps, while R's functional programming style offers more flexibility. Both can handle complex statistical operations, but their syntax and workflow differ significantly.</p>
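<p>To make the "user-defined functions" row more concrete, here is a rough sketch of reusable code in R, covering the kind of job a SAS macro often does. The function name <code>summarise_by_group</code> and its arguments are invented for this example.</p>

<pre><code class="language-r"># Hypothetical helper: summarise one numeric column by one grouping column
summarise_by_group <- function(data, value, group) {
  groups <- split(data[[value]], data[[group]])
  data.frame(
    group = names(groups),
    n = vapply(groups, length, integer(1)),
    mean = vapply(groups, mean, numeric(1)),
    sd = vapply(groups, sd, numeric(1)),
    row.names = NULL
  )
}

patients <- data.frame(
  treatment = rep(c("a", "b"), each = 3),
  age = c(24, 23, 25, 30, 36, 34)
)

# The same function works for any data frame and column names
summary_table <- summarise_by_group(patients, value = "age", group = "treatment")
summary_table</code></pre>

<p>Unlike a macro, which generates code as text, the function takes data in and returns a data frame, so its result can be tested and passed on to the next step.</p>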

<h3 id="access">Access to new developments</h3>
<p>The speed at which each platform adopts new statistical methods and techniques varies considerably:</p>

<p>Open-source software adoption has increased dramatically in recent years. The collaborative nature of the R community allows for quicker implementation of cutting-edge methods. When researchers develop new statistical techniques, they often release R packages alongside their academic papers, making these methods immediately available to practitioners.</p>

<p>With R, you can see exactly how algorithms work since all code is open source. There's no guesswork about implementation details or whether a method is optimal for your specific case.</p>
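<p>A quick way to check this for yourself is to print a function's source from the console. The sketch below uses functions from the <code>stats</code> package that ships with R; note that some functions eventually call compiled code, in which case the rest of the implementation lives in the package's C or Fortran sources.</p>

<pre><code class="language-r"># Typing a function's name without parentheses prints its source
sd

# The same works for functions addressed through a package namespace
stats::var

# List the methods behind a generic, then look at one of them
methods(t.test)
getAnywhere("t.test.default")</code></pre>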

<p>In contrast, new algorithms take longer to appear in SAS. This means <strong>advanced data science techniques might be available right now in R but not yet in SAS</strong>. For teams working at the cutting edge of statistics or machine learning, this difference can be crucial.</p>

<h3 id="collaboration">Collaboration capabilities</h3>
<p>How easily can you share your work with colleagues and collaborators?</p>

<p>With R, file sharing and collaboration are straightforward. If you want to share an analysis with a colleague, they don't need a license - they can simply download R (free) and run your code. This removes significant barriers to collaboration.</p>

<p>R also makes it easy to publish interactive dashboards to the web using Shiny. These dashboards can be shared with stakeholders who don't need to understand the underlying code to benefit from your analysis.</p>

<p>SAS sharing is more restricted. If you want to share SAS work with someone, they'll need their own SAS license to run your code. While SAS does offer some free limited versions, these require account setup and have restrictions that can impede smooth collaboration.</p>

<blockquote>Get your data story into the hands of colleagues quickly using these <a href="https://appsilon.com/how-to-share-r-shiny-apps/" target="_blank" rel="noopener">top 3 methods for sharing R Shiny apps</a>.</blockquote>

<p>The collaboration advantage of R becomes even more pronounced in academic settings, cross-organizational projects, or any situation where you can't guarantee all participants have access to expensive software licenses.</p>

<h2 id="practical">Building Your Data Science Team</h2>
<p>When you're building a data science team, your choice of technology stack affects who you can hire, how quickly they can get up to speed, and what tools they'll have at their disposal. Let's look at the practical aspects of staffing and equipping a team using either SAS or R.</p>

<h3 id="hiring">Hiring SAS vs R developers</h3>
<p>Over the past decade, universities have increasingly shifted from teaching SAS to teaching R. Even domain-specific statistics courses now commonly use R and train students on the RStudio IDE. This trend means the pool of R-skilled graduates continues to grow each year.</p>

<p>That said, R isn't as popular among developers as Python. At the time of writing, the <a href="http://www.tiobe.com/tiobe-index/" target="_blank" rel="noopener">TIOBE Index</a> ranks Python #1 in programming language popularity, with R at #16 and SAS at #26. If your priority is building a large team quickly and you already use Python for analytics, it might make sense to stick with that ecosystem.</p>

<p>It's worth noting that you can now use <a href="https://www.rstudio.com/blog/three-ways-to-program-in-python-with-rstudio/#:~:text=The%20RStudio%20IDE(opens%20in,would%20in%20an%20R%20script." target="_blank" rel="noopener">Python within RStudio</a>, giving you flexibility to mix languages based on your team's skills. If you need help setting up this kind of multi-language environment, consider reaching out to specialized consultants who can help with the configuration.</p>

<h3 id="learn">Learning curve and educational resources</h3>
<p>If you're just getting started, we recommend learning R first. It's free to access, easy to install, and has abundant learning resources. The barrier to entry is much lower than with SAS.</p>

<p><strong>R educational resources</strong> are plentiful and often free. Excellent books like <a href="https://rstudio-education.github.io/hopr/" target="_blank" rel="noopener">Hands-on Programming with R</a> and <a href="https://r4ds.had.co.nz/" target="_blank" rel="noopener">R for Data Science</a> are available online at no cost. You can find specialized resources for topics like <a href="https://bookdown.org/yihui/rmarkdown/" target="_blank" rel="noopener">reporting</a> and <a href="https://mastering-shiny.org/" target="_blank" rel="noopener">web application development</a>. Beyond books, there are webinars, forums, and a helpful community that makes the learning process smoother.</p>

<p><strong>SAS educational resources</strong> are more centralized. SAS offers <a href="https://www.sas.com/en_us/training/overview.html" target="_blank" rel="noopener">formal courses</a> to learn their software, along with extensive <a href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/pgmsaswlcm/home.htm" target="_blank" rel="noopener">documentation</a>. SAS also provides point-and-click interfaces like <a href="https://support.sas.com/en/software/enterprise-guide-support.html" target="_blank" rel="noopener">SAS Enterprise Guide</a> that don't require coding knowledge, which can be handy for team members who aren't programmers.</p>

<h3 id="packages">R Packages vs SAS tools</h3>
<p>As mentioned earlier, R packages can be developed by anyone. While this open approach doesn't guarantee quality, widely used packages tend to be reliable and well maintained. When bugs are discovered, the community often helps fix issues quickly, leading to rapid improvements.</p>

<p>The <a href="https://www.tidyverse.org/" target="_blank" rel="noopener">tidyverse</a> collection of R packages exemplifies this strength. These packages share a common design philosophy and work together seamlessly, creating a consistent experience for data manipulation, visualization, and modeling.</p>

<p>SAS development follows a more controlled approach. If your team identifies a problem in SAS, you'll need to report it to SAS and wait for an official fix in a future release. This process ensures stability but can delay access to improvements or bug fixes.</p>

<h3 id="support">Support options</h3>
<p>SAS provides official <a href="https://support.sas.com/en/technical-support.html" target="_blank" rel="noopener">Technical Support</a> and comprehensive <a href="https://support.sas.com/en/knowledge-base.html" target="_blank" rel="noopener">Documentation</a>. This formal support structure can be reassuring for enterprise teams that need guaranteed assistance.</p>

<p>R doesn't offer official technical support since it's open source. Instead, it has a large, active community you can reach out to through forums like <a href="https://stackoverflow.com/questions/tagged/r" target="_blank" rel="noopener">Stack Overflow</a> or the <a href="https://community.rstudio.com/" target="_blank" rel="noopener">Posit Community</a>. Most R packages are well documented and include tutorials with worked examples, called vignettes.</p>

<p>If you're coming from Python, you'll likely be pleasantly surprised by the quality of R documentation. The R community places a strong emphasis on clear documentation and examples (e.g., <a href="https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html" target="_blank" rel="noopener">dplyr vignette</a>, <a href="https://tidyr.tidyverse.org/articles/tidy-data.html" target="_blank" rel="noopener">tidyr vignette</a>).</p>
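<p>Vignettes and help pages can also be reached from the R console. The snippet below is a small sketch that uses the <code>grid</code> package only because it ships with R; any installed package name could be substituted.</p>

<pre><code class="language-r"># List the vignettes shipped with an installed package
available <- vignette(package = "grid")
available$results[, c("Item", "Title")]

# Opening a vignette or help page launches a viewer,
# so do it only in an interactive session
if (interactive()) {
  vignette("grid", package = "grid")
  help("t.test")
}</code></pre>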

<h3>R Programming Outsourcing and Consulting</h3>
<p>If your team needs specialized expertise, there's a growing number of R-focused consultancies that can help. These firms specialize in developing complex Shiny applications and solving challenging data analysis problems.</p>

<p>At Appsilon, we build, maintain, and develop Shiny applications for enterprise clients worldwide. Our team provides enhanced capabilities beyond native Shiny, including improved scalability, security, and modern UI/UX through custom R packages. We've pioneered <a href="https://appsilon.com/how-we-built-a-shiny-app-for-700-users/" target="_blank" rel="noopener">Shiny innovations</a> that push the boundaries of what's possible with R.</p>

<p>As an <a href="https://appsilon.com/appsilon-data-science-is-now-an-rstudio-full-service-certified-partner/">Posit Full Service Certified Partner</a>, we help clients implement and scale Posit products to simplify data-driven decision making.</p>

<p>Some specialized services that R consultants can provide include:</p>

<ul>
<li>Rapid dashboard development</li>
<li>Full-stack engineering support</li>
<li>DevOps advisory for Posit products</li>
<li>Machine learning solutions</li>
<li>Advanced statistical modeling</li>
</ul>

<p>Working with consultants often accelerates R adoption in your team while making sure you follow best practices from the start.</p>

<h2 id="visualization">Data Visualization Capabilities</h2>
<p>Communicating insights is next to impossible without adequate data visualization. Let's compare how SAS and R handle creating charts, graphs, and visual reports that help stakeholders understand complex data.</p>

<h3 id="viz">Creating visualizations in R and SAS</h3>
<p>R provides a rich variety of packages for creating both static and interactive visualizations. The most popular options include <a href="https://ggplot2.tidyverse.org/" target="_blank" rel="noopener">ggplot2</a> for elegant static charts, <a href="https://plotly.com/r/getting-started/" target="_blank" rel="noopener">plotly</a> for interactive graphics, and <a href="https://www.appsilon.com/post/r-shiny-highcharts" target="_blank" rel="noopener">highcharter</a> for business-ready visuals. The flexibility of these tools allows for extensive customization to match your organization's branding and specific needs.</p>

<p>SAS visualization features are more limited and don't offer the same level of customization. While SAS can produce good-looking charts through SAS/GRAPH, the styling options are more restricted and typically follow SAS's standard visual templates.</p>

<h3>Visualization examples</h3>
<p>Let's compare how each platform handles a common visualization task: creating a histogram to show age distribution by treatment group.</p>

<h4>SAS Example</h4>
<p>Here's how you'd create a dataset and histogram in SAS:</p>

<pre><code class="language-sas">* create example dataset;
data patients;
input treatment $ age sex $;
cards;
a 24 m
a 23 m
a 25 m
a 21 m
a 22 f
a 22 f
a 23 f
a 28 f
a 21 f
a 20 f
a 29 f
a 18 f
a 30 f
a 23 f
a 25 f
a 24 f
a 23 f
a 25 f
b 30 f
b 36 f
b 34 f
b 31 f
b 32 m
b 32 m
b 34 m
b 33 m
b 34 m
b 30 m
b 28 m
b 33 m
b 40 m
b 22 m
b 29 m
;
run;

/*create histogram for age variable by treatment*/
proc univariate data=patients;
    class treatment;
    var age;
    histogram age / overlay;
run;</code></pre>

<p>This produces a functional histogram with SAS's default styling.</p>

<h4>R Example</h4>
<p>Now let's create the same visualization using R:</p>

<pre><code class="language-r"># load libraries
library(tibble)
library(ggplot2)

# create example dataset
patients <- tibble::tribble(
  ~treatment, ~age, ~sex,
  "a", 24, "m",
  "a", 23, "m",
  "a", 25, "m",
  "a", 21, "m",
  "a", 22, "f",
  "a", 22, "f",
  "a", 23, "f",
  "a", 28, "f",
  "a", 21, "f",
  "a", 20, "f",
  "a", 29, "f",
  "a", 18, "f",
  "a", 30, "f",
  "a", 23, "f",
  "a", 25, "f",
  "a", 24, "f",
  "a", 23, "f",
  "a", 25, "f",
  "b", 30, "f",
  "b", 36, "f",
  "b", 34, "f",
  "b", 31, "f",
  "b", 32, "m",
  "b", 32, "m",
  "b", 34, "m",
  "b", 33, "m",
  "b", 34, "m",
  "b", 30, "m",
  "b", 28, "m",
  "b", 33, "m",
  "b", 40, "m",
  "b", 22, "m",
  "b", 29, "m"
)

# create chart
ggplot(data = patients, aes(x = age, fill = treatment)) +
  geom_histogram(position = "identity",
                 alpha = 0.5,
                 bins = 9,
                 color = "black") +
  labs(
    title = "Distribution of age by treatment",
    x = "Age (years)",
    y = "Number of Patients",
    fill = "Treatment"
  ) +
  theme_minimal() +
  theme(
    legend.position = "top"
  )</code></pre>

<p>This produces a more polished visualization with cleaner aesthetics.</p>

<h3>Interactive visualizations</h3>
<p>Where R really shines is in creating interactive visualizations through Shiny applications. These interactive dashboards let users explore data in real-time, apply filters, zoom into areas of interest, and discover insights at their own pace.</p>

<p>Here's how R and Shiny can transform data visualization:</p>

<ul>
<li>Create dashboards that update in real-time as data changes</li>
<li>Allow users to filter, sort, and explore data without coding knowledge</li>
<li>Build custom applications tailored to specific business needs</li>
<li>Embed sophisticated statistical analysis alongside visualizations</li>
<li>Deploy to the web for easy sharing across your organization</li>
</ul>
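<p>For a sense of how little code a basic interactive view needs, here is a minimal Shiny sketch, assuming the <code>shiny</code> package is installed. The input and output IDs (<code>group</code>, <code>age_hist</code>) and the tiny dataset are placeholders, not part of any real application.</p>

<pre><code class="language-r">library(shiny)

# Toy data, reusing the six patients from the earlier example
patients <- data.frame(
  treatment = rep(c("a", "b"), each = 3),
  age = c(24, 23, 25, 30, 36, 34)
)

# UI: one dropdown and one plot (IDs are hypothetical)
ui <- fluidPage(
  titlePanel("Age by treatment"),
  selectInput("group", "Treatment group", choices = c("a", "b")),
  plotOutput("age_hist")
)

# Server: the plot re-renders whenever input$group changes
server <- function(input, output, session) {
  output$age_hist <- renderPlot({
    selected <- patients[patients$treatment == input$group, ]
    hist(selected$age,
         main = paste("Treatment", input$group),
         xlab = "Age (years)")
  })
}

# Launch only in an interactive session so the script can also be sourced
if (interactive()) {
  shinyApp(ui, server)
}</code></pre>

<p>The reactive link between <code>input$group</code> and <code>renderPlot()</code> is what lets a non-technical user filter the data without touching the code.</p>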

<p>SAS does offer some interactive capabilities through SAS Visual Analytics, but these require additional licensing costs and don't match the flexibility of R's open-source tools.</p>

<h3>Reporting and document creation</h3>
<p>SAS uses the <a href="https://www.tutorialspoint.com/sas/sas_output_delivery_system.htm" target="_blank" rel="noopener">Output Delivery System (ODS)</a> to generate formatted reports in various formats like HTML, PDF, and RTF. This system provides decent formatting options but follows relatively strict templates.</p>

<p>R offers more flexible reporting through <a href="https://rmarkdown.rstudio.com/" target="_blank" rel="noopener">R Markdown</a> and <a href="https://quarto.org/" target="_blank" rel="noopener">Quarto</a>. These tools let you combine code, visualizations, and narrative text in a single document that can be rendered to multiple formats. You can create everything from quick reports to books, websites, and presentations using the same framework.</p>

<p>For teams that need to produce regular reports, R Markdown's parameterized reports feature is especially valuable. You can create report templates that automatically update with fresh data, saving hours of manual work each week.</p>
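<p>A parameterized report could be set up roughly as follows. This is a sketch under assumptions: the template content, the <code>treatment</code> parameter, and the helper <code>render_treatment_report()</code> are invented for illustration, and rendering requires the <code>rmarkdown</code> package plus pandoc.</p>

<pre><code class="language-r">patients <- data.frame(
  treatment = rep(c("a", "b"), each = 3),
  age = c(24, 23, 25, 30, 36, 34)
)

# Build the code-chunk delimiter (three backticks) programmatically
fence <- strrep("`", 3)

# Write a tiny template whose YAML header declares one parameter
template <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Treatment report\"",
  "output: html_document",
  "params:",
  "  treatment: \"a\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "subset(patients, treatment == params$treatment)",
  fence
), template)

# Hypothetical helper: render the same template for a given parameter value
render_treatment_report <- function(treatment) {
  rmarkdown::render(
    input = template,
    params = list(treatment = treatment),
    output_file = paste0("report-", treatment, ".html"),
    output_dir = tempdir(),
    envir = new.env(parent = globalenv()),
    quiet = TRUE
  )
}

# Needs rmarkdown and pandoc, so the call is left commented out here
# reports <- lapply(c("a", "b"), render_treatment_report)</code></pre>

<p>Scheduling a script like this is one common way to regenerate the same report for each site, study, or reporting period.</p>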

<blockquote>Need a Shiny dashboard now? <a href="https://templates.appsilon.com/" target="_blank" rel="noopener">Download our free Shiny templates and get started today</a>!</blockquote>

<p>In summary: Both platforms can create effective visualizations, but R offers more flexibility, customization options, and a more modern approach to interactive data exploration.</p>

<h2 id="clinical">Applications in Clinical Data Science</h2>
<p>The pharmaceutical and healthcare industries have traditionally relied heavily on SAS for regulatory submissions and clinical trials analysis. However, R is catching up and, in some areas, pulling ahead. Let's explore how each platform performs in clinical data science scenarios.</p>

<h3 id="use">Should you use SAS or R for clinical data science?</h3>
<p>SAS has long been the standard in clinical research due to its validation processes and acceptance by regulatory authorities. It excels at sequential processing and producing standardized outputs that comply with submission requirements.</p>

<p>R offers greater flexibility and is rapidly gaining traction in clinical settings. With recent successes by the R Consortium and increased collaboration with the FDA, R is moving toward higher standardization to meet regulatory demands. For many organizations, a hybrid approach makes sense during this transition period.</p>

<blockquote><a href="https://www.appsilon.com/post/pharma-fda-rejections" target="_blank">FDA Compliance in Software Development: Cases Where Poor Software Quality Led to Costly FDA Rejections</a></blockquote>

<h3 id="statement">Clinical analysis example</h3>
<p>Let's work through a realistic scenario: analyzing how different variables affect mortality from a specific disease. We want to understand the differences between treatment application times (no treatment, fast treatment, slow treatment) using logistic regression.</p>

<p>Our example dataset contains anonymized patient information with these variables:</p>

<ul>
<li><strong>ID</strong>: Patient identifier</li>
<li><strong>AGE</strong>: Patient age in years</li>
<li><strong>SEX</strong>: Patient sex (F = Female, M = Male)</li>
<li><strong>CHARLSON</strong>: <a href="https://en.wikipedia.org/wiki/Comorbidity#Charlson_index" target="_blank" rel="noopener">Charlson comorbidity score</a></li>
<li><strong>PITT</strong>: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7156778/#:~:text=The%20Pitt%20bacteremia%20score%20(PBS)%20is%20widely%20used%20in%20infectious,risk%20of%20death%20%5B9%5D." target="_blank" rel="noopener">Pitt bacteremia score</a></li>
<li><strong>SURVIVED</strong>: Indicator (1 = patient was cured)</li>
<li><strong>DIED_OF_DISEASE</strong>: Indicator (1 = patient died from the disease)</li>
<li><strong>DIED_OTHER_CAUSE</strong>: Indicator (1 = patient died from another cause)</li>
<li><strong>UNKNOWN</strong>: Indicator (1 = patient status unknown)</li>
<li><strong>TREATMENT</strong>: Indicator (1 = patient received treatment)</li>
<li><strong>TREATMENT_FAST</strong>: Indicator (1 = treatment applied within 48 hours)</li>
</ul>

<p>We'll walk through the key steps of this analysis in both SAS and R:</p>

<h4>Step 1: Reading the data</h4>

<p><strong>SAS approach:</strong></p>
<pre><code class="language-sas">* READ DATA;
FILENAME REFFILE '/home/u4729884/data.csv';
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=WORK.RAW_DATA;
GETNAMES=YES;
RUN;</code></pre>

<p><strong>R approach:</strong></p>
<pre><code class="language-r"># Read data
raw_data <- read.csv("data.csv")</code></pre>

<p>Right away, you can see that R accomplishes this task with a single line of code, while SAS requires several lines with specific parameters.</p>
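<p>In practice you may want a little more control over column types and missing values than the one-liner gives you. The sketch below writes a small made-up CSV to a temporary file so it can run on its own; the rows are invented and only borrow a few column names from the dataset described above.</p>

<pre><code class="language-r"># Write a small example file so the snippet is self-contained (made-up rows)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,AGE,SEX,TREATMENT",
  "1,64,F,1",
  "2,58,M,0",
  "3,,F,1"
), csv_path)

# Declare column types up front and treat empty fields as missing
raw_data <- read.csv(
  csv_path,
  colClasses = c(ID = "integer", AGE = "numeric",
                 SEX = "character", TREATMENT = "integer"),
  na.strings = c("", "NA")
)

# Check the structure and count missing values per column
str(raw_data)
colSums(is.na(raw_data))</code></pre>

<p>Declaring <code>colClasses</code> makes the import fail loudly if a column does not contain what you expect, which is roughly the role that informats and PROC IMPORT options play on the SAS side.</p>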

<h4>Step 2: Data wrangling</h4>
<p>Now we need to prepare the data for analysis by:</p>

<ul>
<li>Removing patients with unknown status</li>
<li>Removing patients who died from other causes</li>
<li>Creating a treatment category variable</li>
<li>Creating indicator variables for age, Charlson score, and Pitt score</li>
<li>Selecting relevant columns for our model</li>
</ul>

<p><strong>SAS approach:</strong></p>
<pre><code class="language-sas">* DATA WRANGLING;
DATA WORK.DATA_TO_MODEL (KEEP = AGE_60
SEX
CHARLSON_4
PITT_3
TREATMENT_CATEGORY
DIED_OF_DISEASE); * KEEP VARS OF INTEREST;
SET WORK.RAW_DATA;

* REMOVE PATIENTS WITH UNKNOWN STATUS;
IF UNKNOWN = 1 THEN DELETE;
* REMOVE PATIENTS THAT DIED DUE TO OTHER CAUSE;
IF DIED_OTHER_CAUSE = 1 THEN DELETE;

* CREATE TREATMENT FACTOR VARIABLE;
LENGTH TREATMENT_CATEGORY $14;
IF TREATMENT = 0 THEN TREATMENT_CATEGORY = "NO TREATMENT";
ELSE IF TREATMENT_FAST = 1 THEN TREATMENT_CATEGORY = "FAST TREATMENT";
ELSE TREATMENT_CATEGORY = "SLOW TREATMENT";
* NEW AGE VARIABLE;
IF AGE >= 60 THEN AGE_60 = 1;
ELSE AGE_60 = 0;
* NEW CHARLSON VARIABLE;
IF CHARLSON > 4 THEN CHARLSON_4 = 1;
ELSE CHARLSON_4 = 0;
* NEW PITT VARIABLE;
IF PITT > 3 THEN PITT_3 = 1;
ELSE PITT_3 = 0;
RUN;</code></pre>

<p><strong>R approach:</strong></p>
<pre><code class="language-r"># Load required library
library(dplyr)

# Data wrangling
data_to_model <- raw_data |>
  # Filter rows
  filter(
    UNKNOWN != 1,
    DIED_OTHER_CAUSE != 1
  ) |>
  # Create new columns
  mutate(
    TREATMENT_CATEGORY = case_when(
      TREATMENT == 0 ~ "NO TREATMENT",
      TREATMENT_FAST == 1 ~ "FAST TREATMENT",
      TRUE ~ "SLOW TREATMENT"
    ),
    AGE_60 = ifelse(AGE >= 60, 1, 0),
    CHARLSON_4 = ifelse(CHARLSON > 4, 1, 0),
    PITT_3 = ifelse(PITT > 3, 1, 0)
  ) |>
  # Select columns
  select(
    AGE_60,
    CHARLSON_4,
    PITT_3,
    TREATMENT_CATEGORY,
    DIED_OF_DISEASE
  )</code></pre>

<p>The R code uses the pipe operator (|>) to chain operations together, making the workflow more readable. Each step clearly shows what's happening to the data. In contrast, the SAS code uses a more procedural approach with separate statements for each operation.</p>
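<p>If the pipe is new to you, the short base R sketch below shows that a piped chain is just another way of writing nested function calls. It assumes R 4.1 or later, where the native <code>|></code> operator was introduced.</p>

<pre><code class="language-r">ages <- c(24, 23, 25, 30, 36, 34)

# Nested call: read from the inside out
nested <- round(mean(sqrt(ages)), 2)

# Piped call: read from top to bottom; each result becomes
# the first argument of the next function
piped <- ages |>
  sqrt() |>
  mean() |>
  round(2)

identical(nested, piped)</code></pre>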

<h4>Step 3: Creating the logistic regression model</h4>
<p>We'll create a logistic regression model to predict the probability of dying from the disease based on:</p>

<ul>
<li>Age (dichotomized at 60 years)</li>
<li>Charlson score (dichotomized at score of 4)</li>
<li>Pitt score (dichotomized at score of 3)</li>
<li>Treatment category (no treatment, fast treatment, slow treatment)</li>
</ul>

<p><strong>SAS approach:</strong></p>
<pre><code class="language-sas">* MODELING;
PROC LOGISTIC DATA = WORK.DATA_TO_MODEL DESCENDING;
CLASS TREATMENT_CATEGORY (REF = "FAST TREATMENT") SEX (REF = "F") / PARAM = REFERENCE;
MODEL DIED_OF_DISEASE = AGE_60 CHARLSON_4 PITT_3 TREATMENT_CATEGORY / LINK = LOGIT SCALE = NONE;
RUN;</code></pre>

<p><strong>R approach:</strong></p>
<pre><code class="language-r"># Create model
model <- glm(formula = DIED_OF_DISEASE ~ .,
data = data_to_model,
family = binomial)

# Explore results
summary(model)

# Get odds ratio
exp(cbind(coef(model), confint(model, level = 0.95)))</code></pre>

<h4>Step 4: Exploring the results</h4>

<p>Both platforms provide detailed output for the logistic regression model, including coefficients, p-values, and odds ratios.</p>

<p>SAS automatically generates comprehensive output tables with odds ratios and confidence intervals.</p>

<p>In R, the <code>summary()</code> function provides the model coefficients and significance levels, while the <code>exp(cbind(...))</code> expression converts the coefficients and their confidence limits into odds ratios.</p>
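<p>One subtlety when comparing the two outputs: <code>confint()</code> on a <code>glm</code> object returns profile-likelihood intervals, whereas the odds ratio table PROC LOGISTIC prints by default uses Wald limits. The sketch below uses a small made-up dataset (not the clinical data from this article) to show both variants. <code>confint.default()</code> gives the Wald version.</p>

<pre><code class="language-r"># Made-up data: 40 patients, two age groups, binary outcome
example <- data.frame(
  AGE_60 = rep(c(0, 1), each = 20),
  DIED_OF_DISEASE = c(rep(c(0, 1), times = c(15, 5)),
                      rep(c(0, 1), times = c(8, 12)))
)

model <- glm(DIED_OF_DISEASE ~ AGE_60, data = example, family = binomial)

# Wald intervals: estimate plus/minus a normal quantile times the standard error
wald <- exp(cbind(odds_ratio = coef(model), confint.default(model)))

# Profile-likelihood intervals: what confint() returns for a glm
profile_ci <- suppressMessages(
  exp(cbind(odds_ratio = coef(model), confint(model)))
)

wald
profile_ci</code></pre>

<p>The point estimates are the same in both tables, and only the interval limits differ slightly, so it is worth checking which type of interval a report or validation comparison expects.</p>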

<h3>Clinical data science: key takeaways</h3>
<p>After comparing the approaches, here are the main observations:</p>

<ul>
<li>Both SAS and R produce matching coefficient estimates and p-values for this model</li>
<li>SAS code requires a semicolon after every statement and an explicit RUN statement, which leaves more room for syntax errors</li>
<li>R code is generally more concise and readable, especially for data manipulation tasks</li>
<li>SAS output includes more information by default and has built-in styling</li>
<li>R requires an extra step to calculate odds ratios but offers more flexibility in how results are presented</li>
</ul>

<p>For clinical data science teams, the choice between SAS and R often comes down to regulatory requirements, existing workflows, and team expertise. Many organizations now use both: SAS for regulatory submissions and R for exploratory analysis and visualization.</p>

<blockquote><a href="https://www.appsilon.com/post/pharmaverse-tools-for-clinical-trials" target="_blank">Working with Clinical Trial Data? There’s a Pharmaverse Package for That</a></blockquote>

<h2 id="choice">Conclusion: How to Make the Right Choice</h2>
<p>After comparing SAS and R across multiple dimensions, you're probably wondering which one makes the most sense for your organization. Let's summarize the key differences and provide some practical guidance for making this important decision.</p>

<h3>SAS vs R: the final comparison</h3>
<p>Here's a quick summary of the major advantages of each platform:</p>

<b>Where R shines:</b>

<ul>
<li><strong>Cost-effectiveness</strong>: R is free and open-source, eliminating licensing costs</li>
<li><strong>Flexibility</strong>: R's package ecosystem offers tools for virtually any data science task</li>
<li><strong>Innovation speed</strong>: New statistical methods typically appear in R first</li>
<li><strong>Visualization</strong>: R provides superior customization options for charts and dashboards</li>
<li><strong>Collaboration</strong>: Code sharing is easier when everyone can access the software for free</li>
<li><strong>Talent pool</strong>: More universities now teach R, increasing the supply of skilled analysts</li>
<li><strong>Integration</strong>: R works well with other modern data science tools and languages</li>
</ul>

<b>Where SAS shines:</b>

<ul>
<li><strong>Enterprise support</strong>: Official technical support with guaranteed response times</li>
<li><strong>Stability</strong>: More controlled development cycle means fewer breaking changes</li>
<li><strong>Regulatory acceptance</strong>: Long history of use in regulated industries like pharmaceuticals</li>
<li><strong>Point-and-click options</strong>: Tools like SAS Enterprise Guide for non-programmers</li>
<li><strong>Standardized outputs</strong>: Consistent formatting across all analyses</li>
<li><strong>Legacy system compatibility</strong>: Better integration with older enterprise systems</li>
</ul>

<h3>Making a strategic decision</h3>
<p>The right choice depends on your specific situation. Here are some scenarios and recommendations:</p>

<b>Consider staying with SAS if:</b>
<ul>
<li>Your regulatory environment strictly requires SAS for submissions</li>
<li>Your team has deep SAS expertise and minimal R experience</li>
<li>You have substantial investment in existing SAS code and workflows</li>
<li>You need guaranteed enterprise support with service level agreements</li>
<li>Budget constraints aren't a primary concern</li>
</ul>

<b>Consider switching to R if:</b>
<ul>
<li>You need to reduce software licensing costs</li>
<li>Your work requires cutting-edge statistical methods</li>
<li>You want to create highly customized or interactive visualizations</li>
<li>Your team includes or can hire R-skilled analysts</li>
<li>You need flexible deployment options including web-based dashboards</li>
<li>You want to tap into the innovation of the open-source community</li>
</ul>

<h3>Planning a successful transition</h3>
<p>If you decide to move from SAS to R, these steps can help ensure a smooth transition:</p>
<ol>
<li><strong>Start small</strong>: Begin with a pilot project that demonstrates R's capabilities</li>
<li><strong>Invest in training</strong>: Provide your team with time and resources to learn R</li>
<li><strong>Use both temporarily</strong>: Maintain SAS for critical systems while building R expertise</li>
<li><strong>Build a package library</strong>: Identify and install the R packages most relevant to your work</li>
<li><strong>Develop standards</strong>: Create coding standards and best practices for your team</li>
<li><strong>Consider expert help</strong>: Partner with R consultants for complex implementations</li>
</ol>
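
<p>For the package library step, a short script can report which packages are still missing on a given machine. This is a minimal sketch: the package list is hypothetical and should be replaced with whatever your team relies on, and the install call is commented out so the check never changes the library by itself.</p>
<pre><code class="language-r"># Hypothetical package list; replace with the packages your team relies on
required_packages <- c("stats", "dplyr", "ggplot2", "shiny", "haven")

# Compare the list against everything installed in the current library paths
installed <- rownames(installed.packages())
missing_packages <- required_packages[!(required_packages %in% installed)]

if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
  # Uncomment to install; left disabled so the check has no side effects
  # install.packages(missing_packages)
} else {
  message("All required packages are installed.")
}</code></pre>
<p>Keeping the list in a shared script, or in a lockfile managed by a tool such as <code>renv</code>, makes it easier for the whole team to work against the same package set.</p>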

<p>Remember that transitioning doesn't have to be all-or-nothing. Many organizations successfully use both tools, leveraging each for its strengths.</p>
<p>If you want to learn more about transitioning to open-source, read these two successful cases from major pharma players:</p>
<ul>
<li><a href="https://www.appsilon.com/post/gsk-r-adoption-journey" target="_blank">GSK’s Open-Source Shift: Training 1,000 Biostatisticians in R</a></li>
<li><a href="https://www.appsilon.com/post/jj-open-source-journey" target="_blank">Open Source in Pharma: J&J's 5-Year Journey to R-Based Regulatory Submissions</a></li>
</ul>

<h3>The future of data analysis</h3>
<p>The data science landscape continues to evolve rapidly. While SAS has dominated statistical analysis for decades, R and Python have transformed how organizations approach data problems.</p>

<p>The trend is clear: open-source tools like R continue to gain ground across industries. Their flexibility, cost-effectiveness, and innovation pace make them increasingly attractive alternatives to traditional commercial software.</p>

<p>For forward-thinking organizations, incorporating R into your data science toolkit is no longer just an option—it's becoming a competitive necessity. The question isn't whether to use open-source tools, but how to integrate them effectively into your existing workflow.</p>

<blockquote>Ready to explore what R can do for your organization? <a href="https://appsilon.com/?utm_source=template_marketplace&utm_campaign=templates#contact" target="_blank" rel="noopener">Contact Appsilon</a> for expert guidance on implementing R and Shiny in your enterprise.</blockquote>

<h3>Final thoughts</h3>
<p>If you're looking to keep pace within your industry and create faster, more flexible data solutions, you should consider adding R to your toolkit. SAS still delivers value for many users, but R's open-source packages are rapidly becoming the standard for modern data science workflows. Don't get left behind!</p>

<p>The shift to R brings numerous advantages:</p>
<ul>
<li>Elimination of expensive licensing costs</li>
<li>Easier code sharing and collaboration</li>
<li>Faster access to cutting-edge methods</li>
<li>A growing talent pool of skilled analysts</li>
<li>Cleaner, more readable code syntax</li>
<li>Superior visualization capabilities</li>
</ul>

<p>Whether you're just starting your data science journey or looking to modernize an established analytics function, now is the perfect time to explore what R can do for your organization.</p>

<blockquote>And this is where Appsilon can help. <a href="http://appsilon.com/contact-us" target="_blank">Make sure to reach out to our team of experts</a>.</blockquote>
