Code
library(dplyr)
library(ggplot2)
library(patchwork)
library(fixest)
library(tinytable)

color_base <- "#007AB7"
color_accent <- "#E69F00"
color_accent2 <- "#009E73"
set.seed(1234)

2.1 Settings

Child penalty (Kleven, Landais, and Søgaard 2019) refers to the negative impact on labor market outcomes (e.g., wages, earnings) that individuals, particularly women, experience after having children. This penalty is often attributed to factors such as career interruptions, reduced work hours, and discrimination. In this section, I will compare several specifications to estimate the child penalty using simulated data.

Simulated data includes the following features:

  • Data is characterized by individuals \(i\) nested within groups \(g\) and observed over time \(t\)
  • Each individual has a birth year \(B_{g, i}\) and age at first childbirth \(A_{g, i}\)
  • First childbirth event occurs at \(E_{g, i} = B_{g, i} + A_{g, i}\)
  • Market outcomes \(Y_{g, i, t}\) are annually observed from 2000 to 2025
  • Market outcomes grow with human capital \(H_{g, i, t}\), which is affected by childbirth with a penalty \(\delta\)
  • Market outcomes are observed from age 20 to 60
  • An individual can have their first child between ages 20 and 40
  • If an individual’s first childbirth occurs after the end of the observation period (\(B_{g,i} + A_{g, i} > 2025\)), both \(A_{g,i}\) and \(E_{g,i}\) are treated as unobserved, and the relative time to event is set to \(k_{g,i,t} = -\infty\)

2.2 Data Generating Process

1. Group Types

For groups \(g=1,\dots,G\), let the type \(U_g\) be defined as

\[ U_g \sim \mathcal N(0,\sigma_u^2) \]

This \(U_g\) captures unobserved heterogeneity across groups affecting both event timing and market outcomes.

2. Event Timing

At each age \(a = 20, \dots, 40\), individual \(i\) in group \(g\) who has not yet had a child gives birth with conditional probability (hazard):

\[ h_{g, a} = \text{logit}^{-1}\left(\beta_0 - \lambda |a - \mu_g|\right), \]

where \(\mu_g = 30 + \alpha_1 U_g\) is the group-specific peak age. The age at first childbirth \(A_{g,i}\) is the first age at which the event occurs; individuals who do not give birth by age 40 are excluded from the sample.

3. Market Outcome

Human capital \(H_{g, i, t}\) is defined as

\[ H_{g, i, t} = t - B_{g, i} - 20 - \delta\mathbb{1}\left\{t \ge E_{g, i}\right\}. \]

Market outcome (wage rate) is generated as

\[ \log Y_{g, i, t} = \gamma_{g, i} + \gamma_1 H_{g, i, t} + \gamma_2 H_{g, i, t}^2 + \nu_{g, i, t}, \]

where \(\gamma_{g, i} \sim \mathcal{N}(\gamma_0 U_g, \sigma_{\gamma}^2)\) and \(\nu_{g, i, t} \sim \mathcal{N}(0, \sigma_{\nu}^2)\).

Here, “child penalty” will be observed through the human capital depreciation after childbirth \(\delta\).

2.3 Data Generation

Code
# Settings
N = 100
n_g <- 50
b_start <- 1960
b_end <- 2005
t_start <- 2000
t_end <- 2025
n_b <- b_end - b_start + 1
n_t <- t_end - t_start + 1
ages <- 20:40

# Parameters
sigma_u <- 1.0
alpha1 <- 2.0
beta0 <- -0.5
lambda <- 0.5
gamma0 <- 0.1
gamma1 <- 0.07
gamma2 <- -0.001
sigma_u <- 1.0
sigma_gamma <- 0.3
sigma_nu <- 0.1
delta <- 3.0

To simulate the data from 2000 to 2025, I generate individuals born between 1960 and 2005. Each individual is observed from age 20 to 60 within the observation period. There are \(G =\) 50 groups, and each group has \(N =\) 100 individuals for each birth year \(B =\) 1960 to 2005. Other parameters used in the data generating process are summarized in Table 2.1.

Code
tibble(
  Parameter = c(
    "$\\sigma_u$",
    "$\\alpha_1$",
    "$\\lambda$",
    "$\\gamma_0$",
    "$\\gamma_1$",
    "$\\gamma_2$",
    "$\\sigma_u$",
    "$\\sigma_\\gamma$",
    "$\\sigma_\\nu$",
    "$\\delta$"
  ),
  Value = c(
    sigma_u,
    alpha1,
    lambda,
    gamma0,
    gamma1,
    gamma2,
    sigma_u,
    sigma_gamma,
    sigma_nu,
    delta
  )
) |>
  tt() |>
  print(output = "markdown")
Table 2.1: Data Generating Process Parameters.
Parameter Value
\(\sigma_u\) 1.000
\(\alpha_1\) 2.000
\(\lambda\) 0.500
\(\gamma_0\) 0.100
\(\gamma_1\) 0.070
\(\gamma_2\) -0.001
\(\sigma_u\) 1.000
\(\sigma_\gamma\) 0.300
\(\sigma_\nu\) 0.100
\(\delta\) 3.000
Code
df <- tibble(
  g = 1:n_g,
  U = rnorm(n_g, 0, sigma_u),
) |>
  tidyr::uncount(weights = N * n_b, .id = "i") |>
  mutate(
    B = rep(rep(b_start:b_end, each = N), n_g),
    gamma_gi = rnorm(n(), gamma0 * U, sigma_gamma)
  ) |>
  mutate(
    A = {
      u_val <- U[1]
      mu_g <- 30 + alpha1 * u_val
      n_ind <- n()
      A_out <- rep(NA_integer_, n_ind)
      for (a in ages) {
        idx <- which(is.na(A_out))
        if (length(idx) == 0) {
          break
        }
        logit_h <- beta0 - lambda * abs(a - mu_g)
        h <- plogis(logit_h)
        born <- runif(length(idx)) < h
        A_out[idx[born]] <- a
      }
      A_out
    },
    .by = g
  ) |>
  filter(!is.na(A)) |>
  tidyr::uncount(weights = n_t, .id = "t") |>
  mutate(
    id = (g - 1) * N * n_b + i,
    t = t_start + t - 1,
    E = A + B,
    k = t - E,
    A = ifelse(E <= t_end, A, NA_integer_),
    E = ifelse(E <= t_end, E, NA_integer_),
    k = ifelse(!is.na(E), k, -Inf),
    H = t - B - 20 - delta * (k >= 0 & !is.na(k)),
    Y = exp(gamma_gi + gamma1 * H + gamma2 * H^2 + rnorm(n(), 0, sigma_nu)),
    age = t - B
  ) |>
  filter(between(age, 20, 60))

df |>
  ggplot(aes(x = A)) +
  geom_freqpoly(
    aes(y = after_stat(density)),
    binwidth = 1,
    color = color_base
  ) +
  labs(x = "Age at First Childbirth", y = "Density") +
  theme_minimal() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )
## Warning: Removed 177400 rows containing non-finite outside the scale range
## (`stat_bin()`).
Figure 2.1: Distribution of Age at First Childbirth.

Figure 2.1 shows the distribution of age at first childbirth \(A\). As designed in the data generating process, the distribution peaks around age 30.

Code
df |>
  summarize(wage = mean(Y), .by = c(A, age)) |>
  filter(A %in% c(25, 30, 35), between(age, 20, 45)) |>
  mutate(
    A = factor(A, levels = c(25, 30, 35), labels = c("25", "30", "35"))
  ) |>
  ggplot(aes(x = age, y = wage, color = A, group = A, linetype = A)) +
  geom_line() +
  scale_color_manual(values = c(color_base, color_accent, color_accent2)) +
  labs(
    x = "Age",
    y = "Average Wage",
    color = "Age at First Childbirth",
    linetype = "Age at First Childbirth"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "bottom"
  )
Figure 2.2: Average Wage by Age and Age at First Childbirth.

Figure 2.2 shows the average wage trajectory by age for individuals with different ages at first childbirth. We can observe wage drop after their first childbirth. An interesting feature of this simulation is that the average wage is higher for individuals who had their first child later (e.g., age 35) compared to those who had their first child earlier (e.g., age 25). This pattern arises because of the group heterogeneity \(U_g\) affecting both the timing of childbirth and wage levels.

2.4 Child Penalty Estimates

Kleven, Landais, and Søgaard (2019)

A common specification to estimate the child penalty is a cross-sectional comparison with age and year fixed effects as in Kleven, Landais, and Søgaard (2019):

\[ Y_{g, i, t} = \lambda_{a(g, i, t)} + \lambda_t + \sum_{k \neq -1, -\infty} \beta_k \mathbb{1}\{t - E_{g, i} = k\} + \varepsilon_{i, t}, \tag{2.1}\]

  • \(Y_{g, i, t} = \exp(w_{g, i, t})\): Wage
  • \(E_{g, i}\): Event time (first childbirth)
  • \(\lambda_{a}\): Age fixed effects
  • \(\lambda_{t}\): Year fixed effects
Code
mdl_kls <- feols(Y ~ i(k, ref = c(-1, -Inf)) | age + t, data = df)

plot_cp <- function(ctb, kname = "k", add_ref = TRUE) {
  ctb <- ctb |>
    as_tibble() |>
    mutate(
      rel_time = as.integer(gsub(paste0(kname, "::"), "", rownames(ctb))),
      coef = Estimate,
      se = `Std. Error`,
      lci = coef - 1.96 * se,
      uci = coef + 1.96 * se
    ) |>
    filter(between(rel_time, -3, 5))

  if (add_ref) {
    ctb <- ctb |>
      add_row(rel_time = -1, coef = 0, lci = 0, uci = 0)
  }

  ctb |>
    ggplot(aes(x = rel_time, y = coef, ymin = lci, ymax = uci)) +
    geom_pointrange(color = color_base) +
    geom_hline(yintercept = 0, linetype = "dashed") +
    labs(
      x = "Relative Time to First Childbirth",
      y = "Child Penalty"
    ) +
    theme_minimal() +
    theme(
      panel.grid.major.x = element_blank(),
      panel.grid.minor = element_blank()
    )
}

ctb <- coeftable(mdl_kls)
plot_cp(ctb)
Figure 2.3: Child Penalty Estimates using Kleven, Landais, and Søgaard (2019) Specification

Figure 2.3 shows the child penalty estimates using the specification in Kleven, Landais, and Søgaard (2019).1 The estimates show a clear negative effect on wages after childbirth, consistent with the true data generating process where childbirth causes a human capital depreciation of 3 years. However, we can observe negative pre-trends before childbirth: the coefficients decline from \(k = -3\) to \(k = -1\). This arises because Equation 2.1 is a cross-sectional comparison across individuals with different event ages \(A\) within the same age and year cell. As shown in Figure 2.2, the group heterogeneity \(U_g\) affects both the timing of childbirth and wage levels: groups with higher \(U_g\) tend to have later childbirth and higher wages. The not-yet-treated individuals (\(k = -\infty\)), who tend to come from groups with higher \(U_g\), serve as the baseline. Since Equation 2.1 does not control for this group-level heterogeneity, the comparison mixes selection effects with the treatment effect, producing a spurious pre-trend.

Two-Way Fixed Effects

Since Kleven, Landais, and Søgaard (2019)’s specification is a cross-sectional comparison, another approach is to utilize the panel structure of the data and estimate the child penalty as a two-way fixed effects event study:

\[ Y_{g, i, t} = \alpha_{g, i} + \lambda_t + \sum_{k \neq -1, -\infty} \beta_k \mathbb{1}\{t - E_{g, i} = k\} + \varepsilon_{g, i, t}, \]

  • \(\alpha_{g, i}\): Individual fixed effects
  • \(\lambda_{t}\): Year fixed effects
Code
mdl_uc <- feols(Y ~ i(k, ref = c(-1, -Inf)) | id + t, data = df)
ctb <- coeftable(mdl_uc)
plot_cp(ctb)
Figure 2.4: Child Penalty Estimates using Two-Way Fixed Effects Specification.

Figure 2.4 shows the child penalty estimates using the two-way fixed effects specification. Because we set the relative time to \(-\infty\) at the data generation stage for individuals whose first childbirth occurs after the observation period (\(E > 2025\)), these individuals naturally serve as a not-yet-treated control group that pins down the year fixed effects.

The estimates show a small positive pre-trend before \(k=-1\) and a sharp drop at \(k = 0\) of about \(-0.26\), which captures the child penalty. The post-event coefficients remain roughly stable around \(-0.26\), showing only slight attenuation at larger \(k\). The small positive pre-trend arises because the outcome \(Y\) is in levels while the true DGP is log-linear: the exponential mapping \(Y = \exp(w)\) creates mild curvature that the linear event-time specification picks up as an upward drift.

Age and Municipality Effect

To avoid the selection bias and control for the age effects, researchers may be interested in controlling them. A flexible way to control for these effects is to split the sample by birth year \(B_i\) and municipality \(g\) (and aggregate the coefficients). This is equivalent to including time fixed effects specific to birth year \(B_i\) and municipality \(g\).

\[ Y_{g, i, t} = \alpha_{g, i} + \lambda_{B(g, i), g, t} + \sum_{k \neq -1, -\infty} \beta_k \mathbb{1}\left\{t - E_{g, i} = k\right\} + \varepsilon_{g, i, t}. \tag{2.2}\]

Code
mdl_agefe <- feols(Y ~ i(k, ref = c(-1, -Inf)) | id + B^g^t, data = df)
ctb <- coeftable(mdl_agefe)
plot_cp(ctb)
Figure 2.5: Child Penalty Estimates using Age and Municipality Fixed Effects Specification.

Figure 2.5 shows the child penalty estimates using this specification. \(B(g, i) \times g \times t\) fixed effects successfully absorb the group-level heterogeneity. The results are qualitatively similar to the TWFE specification: a small positive pre-trend, a sharp drop at \(k = 0\) of about \(-0.26\), and post-event coefficients that gradually increase in magnitude (reaching about \(-0.29\) by \(k = 10\)). The mild pre-trend again reflects the curvature from fitting a linear event-study to a log-linear DGP in levels.

Imputation Method

Researchers may want to test the anticipation effects by checking whether there are any significant effects in the years leading up to childbirth. To test this, assume that there are no anticipation effects in 4 years prior to childbirth (\(k = \dots, -5, -4\)). Then, we can modify the event study specification as follows:

\[ Y_{g, i, t} = \alpha_{g, i} + \lambda_{B(g, i), g, t} + \sum_{k \ge -3} \beta_k \mathbb{1}\left\{t - E_{g, i} = k\right\} + \varepsilon_{g, i, t}. \]

This can be estimated by an imputation method (Borusyak, Jaravel, and Spiess 2024):

  1. \(Y_{g, i, t} \sim \alpha_{g, i} + \lambda_{B(g, i), g, t}\) to obtain \(\hat{Y}_{g, i, t}\) for untreated observations.
  2. Compute residuals \(\tilde{Y}_{g, i, t} = Y_{g, i, t} - \hat{Y}_{g, i, t}\).
  3. Estimate \(\tilde{Y}_{g, i, t} \sim \sum_{k \ge -3} \beta_k \mathbb{1}\left\{t - E_{g, i} = k\right\}\).
Code
mdl_imp1 <- df |>
  filter(is.finite(k) & k <= -4) |>
  feols(
    Y ~ 0 | id + B^g^t,
    combine.quick = FALSE
  )

df_imputed <- df |>
  mutate(
    Y_resid = Y - predict(mdl_imp1, newdata = df),
    k_bin = ifelse(k <= -4, -4, k)
  ) |>
  filter(!is.na(Y_resid) & is.finite(k))

mdl_imp2 <- feols(
  Y_resid ~ 0 + i(k_bin, ref = -4),
  data = df_imputed,
  cluster = ~g
)

ctb <- coeftable(mdl_imp2)
plot_cp(ctb, kname = "k_bin", add_ref = FALSE)
Figure 2.6: Child Penalty Estimates using Imputation Method

The imputation method exhibits negligible pre-trends, with coefficients at \(k = -3, -2, -1\) close to zero. The post-event estimates are stable around \(-0.26\) to \(-0.27\).

2.5 A Clean Way to Estimate Child Penalty

While the imputation method helps to test for anticipation effects, there is still a concern that the estimates may be contaminated by other cohorts if there are overlapping observation periods.

Code
df_imputed |>
  filter(between(k, -3, 5)) |>
  ggplot(aes(x = A, y = factor(k))) +
  ggridges::geom_density_ridges(
    stat = "binline",
    binwidth = 1,
    scale = 0.95,
    draw_baseline = FALSE,
    fill = color_base,
    alpha = 0.6
  ) +
  coord_flip() +
  labs(
    x = "Age at First Childbirth (A)",
    y = "Relative Time to First Childbirth (k)"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor = element_blank()
  )
Figure 2.7: Distribution of Age at First Childbirth by Relative Time

Figure 2.7 illustrates the distribution of age at first childbirth \(A\) by relative time to childbirth \(k\) in the imputed sample. The imputation method requires each individual to have at least one observation with \(k \le -4\) (i.e., \(E_i \ge 2004\)), which drops individuals whose event occurs too early. As a result, the age composition for each event time \(k\) may vary. With an observation window extending to 2025, the narrowing of the \(A\) distribution at larger \(k\) is less severe than it would be with a shorter panel, but the constraint still binds at the extremes.

This narrowing arises from the observation window constraint. For an individual with birth year \(B\) and event age \(A\), event time \(E = B + A\). To observe this individual at event time \(k\), we need \(t = E + k \in [2000, 2025]\) and \(A = t - B \in [20, 40]\) (the childbearing age range). At large positive \(k\), both the upper bound on \(t\) and the imputation requirement (\(E \ge 2004\)) restrict which combinations of \((B, A)\) can contribute, progressively excluding individuals with large \(A\).

An important message is that the researchers need to recognize the composition of age at event \(A\) for each relative time \(k\). An honest way to report the child penalty estimates is to show the estimates separately by age at event \(A\).

Code
ctb <- df_imputed |>
  filter(between(A, 23, 31)) |>
  feols(Y_resid ~ 0 + i(k_bin, ref = -4), cluster = ~g, split = ~A) |>
  coeftable()

ctb |>
  as_tibble() |>
  tidyr::separate(coefficient, into = c(NA, "rel_time"), sep = "::") |>
  select(A = sample, rel_time, est = Estimate, se = `Std. Error`) |>
  mutate(lbl_facet = paste0("Age at First Childbirth: ", A)) |>
  filter(between(as.integer(rel_time), -3, 5)) |>
  ggplot(aes(
    x = as.integer(rel_time),
    y = est,
    ymin = est - 1.96 * se,
    ymax = est + 1.96 * se,
  )) +
  geom_pointrange(color = color_base) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_wrap(~lbl_facet) +
  labs(
    x = "Relative Time to First Childbirth",
    y = "Child Penalty"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )
Figure 2.8: Child Penalty Estimates by Age at First Childbirth. Estimates are shown separately by age at first childbirth.

To report the overall child penalty estimates, researchers can aggregate the estimates by age at event \(A\) using explicit weights. A natural choice is the number of individuals at each event age, which reflects the composition of the population.

Code
n_by_A <- df_imputed |>
  filter(between(A, 23, 31)) |>
  distinct(id, A) |>
  count(A)

ctb |>
  as_tibble() |>
  tidyr::separate(coefficient, into = c(NA, "rel_time"), sep = "::") |>
  select(A = sample, rel_time, est = Estimate, se = `Std. Error`) |>
  filter(between(as.integer(rel_time), -3, 5)) |>
  mutate(A = as.integer(A)) |>
  left_join(n_by_A, by = "A") |>
  summarize(
    est = weighted.mean(est, w = n),
    se = sqrt(sum((n / sum(n))^2 * se^2)),
    .by = rel_time
  ) |>
  ggplot(aes(
    x = as.integer(rel_time),
    y = est,
    ymin = est - 1.96 * se,
    ymax = est + 1.96 * se,
  )) +
  geom_pointrange(color = color_base) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    x = "Relative Time to First Childbirth",
    y = "Child Penalty"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )
Figure 2.9: Aggregated Child Penalty Estimates. Estimates are aggregated over age at first childbirth, weighted by the number of individuals at each event age.

Balanced Cohort Composition

The aggregation above still suffers from varying cohort composition across event times. To eliminate this issue entirely, we restrict the sample to individuals who are observed at all event times \(k = -3, \dots, 5\). This ensures that the composition of individuals is identical across all \(k\).

Code
# Keep only individuals observed at all k in -3:5
balanced_ids <- df_imputed |>
  filter(between(A, 24, 31), between(k, -3, 5)) |>
  count(id) |>
  filter(n == 9) |>
  pull(id)

df_imputed_bal <- df_imputed |>
  filter(id %in% balanced_ids)

# Estimate by age at first childbirth and aggregate
ctb_bal <- df_imputed_bal |>
  filter(between(A, 24, 31)) |>
  feols(Y_resid ~ 0 + i(k_bin, ref = -4), cluster = ~g, split = ~A) |>
  coeftable()

n_by_A_bal <- df_imputed_bal |>
  filter(between(A, 24, 31)) |>
  distinct(id, A) |>
  count(A)
Code
ctb_bal |>
  as_tibble() |>
  tidyr::separate(coefficient, into = c(NA, "rel_time"), sep = "::") |>
  select(A = sample, rel_time, est = Estimate, se = `Std. Error`) |>
  mutate(lbl_facet = paste0("Age at First Childbirth: ", A)) |>
  filter(between(as.integer(rel_time), -3, 5)) |>
  ggplot(aes(
    x = as.integer(rel_time),
    y = est,
    ymin = est - 1.96 * se,
    ymax = est + 1.96 * se,
  )) +
  geom_pointrange(color = color_base) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_wrap(~lbl_facet) +
  labs(
    x = "Relative Time to First Childbirth",
    y = "Child Penalty"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )
Figure 2.10: Child Penalty Estimates by Age at First Childbirth (Balanced Sample). Sample restricted to individuals observed at all event times \(k = -3, \dots, 5\).
Code
ctb_bal |>
  as_tibble() |>
  tidyr::separate(coefficient, into = c(NA, "rel_time"), sep = "::") |>
  select(A = sample, rel_time, est = Estimate, se = `Std. Error`) |>
  filter(between(as.integer(rel_time), -3, 5)) |>
  mutate(A = as.integer(A)) |>
  left_join(n_by_A_bal, by = "A") |>
  summarize(
    est = weighted.mean(est, w = n),
    se = sqrt(sum((n / sum(n))^2 * se^2)),
    .by = rel_time
  ) |>
  ggplot(aes(
    x = as.integer(rel_time),
    y = est,
    ymin = est - 1.96 * se,
    ymax = est + 1.96 * se,
  )) +
  geom_pointrange(color = color_base) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    x = "Relative Time to First Childbirth",
    y = "Child Penalty"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )
Figure 2.11: Aggregated Child Penalty Estimates with Balanced Cohort Composition. Sample restricted to individuals observed at all event times \(k = -3, \dots, 5\), ensuring identical composition across \(k\).

  1. In Kleven, Landais, and Søgaard (2019), the coefficients are normalized by the average wage imputed from the estimated \(\lambda_{a}, \lambda_t\).↩︎