vivainsights.create_survival_prep
create_survival_prep: Convert Standard Person Query panel data into person-level survival analysis format (time-to-event + event indicator).
This is typically used as the first step before calling create_survival().
Example
>>> import vivainsights as vi
>>> from vivainsights.create_survival_prep import create_survival_prep
>>>
>>> pq_data = vi.load_pq_data()
>>> surv_data = create_survival_prep(
... data=pq_data,
... metric="Copilot_actions_taken_in_Teams",
... )
>>> surv_data.head()
With a custom event condition and HR attribute:
>>> surv_data = create_survival_prep(
... data=pq_data,
... metric="Copilot_actions_taken_in_Teams",
... event_condition=lambda x: x >= 10,
... hrvar="LevelDesignation",
... )
Pass the output directly to create_survival:
>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> surv_data = vi.create_survival_prep(pq_data, metric="Copilot_actions_taken_in_Teams")
>>> fig = vi.create_survival(surv_data, time_col="time", event_col="event", hrvar="Organization")
- vivainsights.create_survival_prep.create_survival_prep(data, metric, event_condition=<function <lambda>>, hrvar='Organization', id_col='PersonId', date_col='MetricDate')
Name
create_survival_prep
Description
Convert a Standard Person Query panel dataset (multiple rows per person, one per period/week) into a person-level survival analysis table suitable for use with create_survival() or create_survival_calc().
For each person the function determines:
time: the number of observed periods until the event first occurred, or the total number of periods observed if the event never occurred (censored).
event: 1 if event_condition was satisfied in at least one period, 0 if censored (condition never met within the observation window).
The HR attribute column (hrvar) is carried through using the most recently observed value per person (last row after sorting by date_col).
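The per-person logic described above can be sketched in plain pandas. This is an illustrative approximation, not the library's implementation: the helper name sketch_survival_prep and the toy data are hypothetical, and the sketch assumes that periods are counted from 1, so an event in a person's first observed period gives time = 1.

```python
import pandas as pd

# Toy panel (hypothetical data): one row per person per week.
panel = pd.DataFrame({
    "PersonId": ["a", "a", "a", "b", "b", "b", "c", "c"],
    "MetricDate": pd.to_datetime([
        "2024-01-01", "2024-01-08", "2024-01-15",
        "2024-01-01", "2024-01-08", "2024-01-15",
        "2024-01-01", "2024-01-08",
    ]),
    "Copilot_actions_taken_in_Teams": [0, 3, 5, 0, 0, 0, 2, 0],
    "Organization": ["Sales", "Sales", "Finance", "HR", "HR", "HR", "IT", "IT"],
})


def sketch_survival_prep(df, metric, event_condition=lambda x: x > 0,
                         hrvar="Organization", id_col="PersonId",
                         date_col="MetricDate"):
    """Hypothetical re-creation of the documented behaviour (1-based periods assumed)."""
    # Sort chronologically within each person.
    df = df.sort_values([id_col, date_col]).copy()
    # Flag the periods where the condition holds.
    df["_met"] = event_condition(df[metric]).astype(bool)
    # Number each person's periods 1, 2, 3, ...
    df["_period"] = df.groupby(id_col).cumcount() + 1

    grouped = df.groupby(id_col)
    n_periods = grouped["_period"].max()
    # First period in which the condition was met (missing if never met).
    first_hit = df[df["_met"]].groupby(id_col)["_period"].min()

    out = pd.DataFrame({
        # Censored people fall back to their total number of observed periods.
        "time": first_hit.reindex(n_periods.index).fillna(n_periods).astype(int),
        "event": grouped["_met"].any().astype(int),
    })
    if hrvar is not None and hrvar in df.columns:
        # Most recently observed value per person.
        out[hrvar] = grouped[hrvar].last()
    return out.reset_index()


result = sketch_survival_prep(panel, metric="Copilot_actions_taken_in_Teams")
print(result)
```

In this toy panel, person "a" first meets the condition in their second period, person "b" never does and is censored after three periods, and person "a" carries the Organization value from their latest row.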
- param data:
Standard Person Query panel data. One row per person per period.
- type data:
pd.DataFrame
- param metric:
Numeric metric column to evaluate against event_condition.
- type metric:
str
- param event_condition:
A function that accepts a pandas Series of metric values and returns a boolean Series. The event is considered to have occurred at the first period where this condition is True.
Examples:
lambda x: x > 0 (default): any non-zero activity
lambda x: x >= 10: at least 10 actions in a period
- type event_condition:
callable, default lambda x: x > 0
- param hrvar:
HR attribute column to carry through into the output (most recent observed value per person). Set to None to omit.
- type hrvar:
str or None, default “Organization”
- param id_col:
Column uniquely identifying each person.
- type id_col:
str, default “PersonId”
- param date_col:
Date/period column used to sort rows chronologically before computing the time-to-event. If absent, the existing row order is preserved.
- type date_col:
str, default “MetricDate”
- returns:
One row per person with columns:
id_col (e.g. “PersonId”)
"time": periods until event, or total observed periods if censored
"event": 1 (event occurred) or 0 (censored)
hrvar column, if supplied and present in data
- rtype:
pd.DataFrame
- raises KeyError:
If metric or id_col is not found in data.
- raises ValueError:
If event_condition does not return a boolean-compatible Series.
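To make the event_condition contract concrete, the sketch below applies two conditions to a small Series and shows one way a boolean-compatible result might be recognised. The helper looks_boolean is hypothetical and is not part of vivainsights; the check the library actually performs may differ.

```python
import pandas as pd

# Hypothetical metric values for one person across five periods.
weekly_actions = pd.Series([0, 4, 12, 0, 10])

# Two conditions in the style shown in the parameter description.
any_activity = lambda x: x > 0
heavy_use = lambda x: x >= 10


def looks_boolean(result, like):
    """Hypothetical check: a Series of the same length with a boolean dtype."""
    return (
        isinstance(result, pd.Series)
        and len(result) == len(like)
        and pd.api.types.is_bool_dtype(result)
    )


print(any_activity(weekly_actions).tolist())
print(heavy_use(weekly_actions).tolist())

# A callable that returns numbers rather than True/False fails this check.
not_a_condition = lambda x: x * 2
print(looks_boolean(any_activity(weekly_actions), weekly_actions))
print(looks_boolean(not_a_condition(weekly_actions), weekly_actions))
```

A comparison on a Series yields a boolean Series of the same length, which is the shape the parameter description asks for; arithmetic such as x * 2 does not.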
Notes
This function mirrors create_survival_prep() in the R vivainsights package. The typical workflow is:
surv_data = create_survival_prep(pq_data, metric="Copilot_actions_taken_in_Teams")
fig = create_survival(surv_data, time_col="time", event_col="event")
Examples
>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> surv_data = create_survival_prep(
... data=pq_data,
... metric="Copilot_actions_taken_in_Teams",
... )
>>> surv_data.head()