R vs. Python
1 Notable Differences Between R and Python
The sections below are a quick reference for the most important differences in syntax and style between R and Python. Don’t worry about memorizing these right now—just know that they exist, and you’ll see examples of each in the next chapters.
1.1 Assignment Operator
R uses <- (preferred convention) or = for assignment.
Python uses = for assignment.
# R
x <- 10

# Python
x = 10

1.2 Function Definition
In R, functions are created with function() and assigned to a variable. Function bodies are wrapped in curly braces {}.
In Python, functions are defined with the def keyword; the signature line ends with a colon and is followed by an indented body.
# R
add_fun <- function(a, b) {
  a + b
}

# Python
def add_fun(a, b):
    return a + b

1.3 Return Values
R automatically returns the last evaluated expression. The return() function is optional.
Python requires an explicit return statement to return a value; without one, the function returns None.
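The None behavior described above is easy to check directly. The following is a minimal sketch; the two function names are hypothetical and chosen only for this illustration.

```python
# Hypothetical helper names, used only to contrast the two cases.
def double_with_return(x):
    return x * 2


def double_without_return(x):
    x * 2  # the product is computed and then discarded: no return statement


with_return = double_with_return(4)
without_return = double_without_return(4)

print(with_return)     # 8
print(without_return)  # None
```

This is the most common surprise when porting an R function line by line: the R version returns its last expression, while the direct Python translation silently returns None.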
# R
double_fun <- function(x) x * 2

# Python
def double_fun(x):
    return x * 2

1.4 Pipe Operator
R has a built-in pipe |> (base R, version 4.1 and later) and the popular %>% operator from magrittr, which dplyr re-exports, for chaining operations.
Python has no native pipe operator. Chaining is done via method calls—polars is especially well-suited for this style.
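# R (filter() and summarise() come from dplyr)
library(dplyr)

data |>
  filter(x > 5) |>
  summarise(avg = mean(x))

# Python (data is assumed to be a polars DataFrame)
import polars as pl

(
    data
    .filter(pl.col("x") > 5)
    .select(pl.col("x").mean().alias("avg"))
)

1.5 String Formatting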
R uses paste(), paste0(), sprintf(), or glue::glue() for string construction.
Python uses f-strings, the .format() method, or % formatting.
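As a small sketch of the three Python styles named above, the snippet below builds the same greeting with each one. The variable names are illustrative only.

```python
name = "John"

via_fstring = f"Hello, {name}"         # f-string (Python 3.6+)
via_format = "Hello, {}".format(name)  # str.format() method
via_percent = "Hello, %s" % name       # printf-style % formatting

# All three produce the same string.
print(via_fstring)
print(via_fstring == via_format == via_percent)
```

In new code, f-strings are usually the closest match to glue::glue() in R, since both interpolate names written inside curly braces.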
# R
name <- "John"
paste0("Hello, ", name)
glue::glue("Hello, {name}")

# Python
name = "John"
f"Hello, {name}"

1.6 Code Block Structure
R uses curly braces {} to define code blocks. Indentation is purely stylistic.
Python uses indentation (typically 4 spaces) to define code blocks. Indentation is syntactically required.
# R
if (x > 0) {
  print("positive")
}

# Python
if x > 0:
    print("positive")

1.7 Whitespace Rules
In R, spaces, line breaks, and indentation are largely flexible and stylistic.
In Python, indentation is syntactic (inconsistent indentation raises an IndentationError). Blank lines are ignored but can be used to separate code blocks for readability.
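To see the error described above without crashing a script, one option is to compile a badly indented snippet from a string and catch the exception. This is only a sketch: the source string and variable names are made up for the illustration.

```python
# Two statements in the same block, indented by different amounts.
bad_source = (
    "if True:\n"
    "    result = 2\n"
    "      print(result)\n"  # two extra spaces: not a valid block level
)

try:
    compile(bad_source, "<example>", "exec")
    error_name = None
except IndentationError as exc:
    # Python rejects the code at compile time, before any line runs.
    error_name = type(exc).__name__

print(error_name)  # IndentationError
```

Because the check happens when the code is parsed, none of the snippet runs, not even the correctly indented first statement.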
# R: all of this is valid
# and produces the same result

# Clean, conventional style
if (x > 0) {
  result <- x * 2
  print(result)
}

# Messy but still valid
if(x>0){result<-x*2
print(result)}

# Extreme and ugly but works
if   (  x  >  0  )   {
        result   <-x*2
  print(   result   )
}

# Python: indentation matters
# Correct: consistent 4-space
# indentation
if x > 0:
    result = x * 2
    print(result)

# Blank lines are fine
# they improve readability
def process(x):

    if x > 0:

        result = x * 2
        print(result)

        return result

# Raises IndentationError
if x > 0:
    result = x * 2
      print(result)  # ❌ bad indent

1.8 Indexing
R uses 1-based indexing.
Python uses 0-based indexing.
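A short sketch of what the offset means in practice: position 3 is the last element of a three-element R vector, but the same index is out of range for a three-element Python list. Variable names are illustrative.

```python
lst = ["a", "b", "c"]

first = lst[0]            # R equivalent: vec[1]
last = lst[len(lst) - 1]  # R equivalent: vec[3] or vec[length(vec)]

try:
    lst[3]                # valid positions are 0, 1, and 2
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, last, out_of_range)  # a c True
```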
# R
vec <- c("a", "b", "c")
vec[1] # returns "a"

# Python
lst = ["a", "b", "c"]
lst[0] # returns "a"

1.9 Missing Values
R uses NA for missing values (with typed variants like NA_integer_ and NA_character_). NULL denotes the absence of an object, and NaN marks an undefined numeric result.
Python uses None natively; polars represents missing values as null in its columnar data structures.
# R
x <- c(1, NA, 3)
is.na(x)

# Python
import polars as pl
s = pl.Series([1, None, 3])
s.is_null()

1.10 Vectorization
In R, vectors are first-class citizens—most operations are vectorized by default.
In Python, native lists are not vectorized. Vector math requires numpy or columnar libraries like polars.
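The sketch below shows what "not vectorized" means for a plain Python list, using only the standard library. Multiplying a list by an integer repeats it rather than scaling each element, so element-wise math needs an explicit loop or comprehension. Variable names are illustrative.

```python
values = [1, 2, 3]

# The * operator on a list means repetition, not arithmetic.
repeated = values * 2

# Element-wise doubling needs an explicit comprehension.
doubled = [v * 2 for v in values]

print(repeated)  # [1, 2, 3, 1, 2, 3]
print(doubled)   # [2, 4, 6]
```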
# R
c(1, 2, 3) * 2
# returns c(2, 4, 6)

# Python
import polars as pl
pl.Series([1, 2, 3]) * 2
# returns Series [2, 4, 6]

1.11 Package Management
In R, packages are installed with install.packages() and loaded with library().
In Python, packages are installed via pip or conda from the terminal and loaded with import.
# R
install.packages("dplyr")
library(dplyr)

# Python
# terminal: pip install polars
import polars as pl

1.12 Comments
In R, comments are single-line only, using #. R has no multi-line comment syntax.
In Python, single-line comments use #. Triple-quoted strings ("""...""") are string literals rather than true comments, but they are commonly used as docstrings and sometimes as block comments.
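One way to see the difference between a # comment and a triple-quoted string is that a docstring is kept on the function object, while a comment is discarded. A minimal sketch, with a hypothetical function name:

```python
def process(x):
    """Double x.

    This triple-quoted string is the function's docstring.
    """
    # A regular comment: discarded entirely when the code is parsed.
    return x * 2


doc = process.__doc__  # the docstring is stored on the function object

print(doc.splitlines()[0])  # Double x.
print(process(2))           # 4
```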
# R
# This is a comment

# Python
# This is a comment
"""
This is a multi-line
comment or docstring.
"""

1.13 Boolean Values
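In R, the Boolean values are TRUE and FALSE. T and F work as abbreviations, but they are ordinary variables that can be reassigned, so the full names are safer.
In Python, the Boolean values are True and False (capitalized, with no abbreviations).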
# R
flag <- TRUE

# Python
flag = True

1.14 Logical Operators
R uses &, |, ! for vectorized logic; &&, || for scalar logic.
# R
c(TRUE, FALSE) & c(TRUE, TRUE)  # vectorized
TRUE && FALSE                   # scalar

Python uses and, or, not for scalar logic; &, |, ~ for vectorized logic in numpy/polars expressions.
# Python
import polars as pl

True and False  # scalar
# Assumes df is a polars DataFrame with columns "a" and "b"
df.filter((pl.col("a") > 0) & (pl.col("b") < 10))  # vectorized

1.15 Data Frames
data.frame is built into base R; enhanced versions include tibble (tidyverse) and data.table.
# R
df <- data.frame(
  name = c("A", "B"),
  value = c(1, 2)
)

In Python, data frames are not built in; they require a library such as polars (pl.DataFrame), which offers fast, multithreaded performance and a clean expression API.
import polars as pl
df = pl.DataFrame({"name": ["A", "B"], "value": [1, 2]})