vivainsights.create_lorenz
Calculate the Gini coefficient and plot the Lorenz curve for a given metric.
- vivainsights.create_lorenz.get_value_proportion(df, population_share)
Look up the cumulative value share for a given population share.
- Parameters:
df (pandas.DataFrame) – DataFrame containing cum_population and cum_values_prop columns (as produced by create_lorenz).
population_share (float) – Cumulative population share, between 0 and 1.
- Returns:
The cumulative value proportion at the given population share.
- Return type:
float
- Raises:
ValueError – If population_share is not between 0 and 1.
Examples
>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> lorenz_table = vi.create_lorenz(data=pq_data, metric="Emails_sent", return_type="table")
>>> vi.get_value_proportion(lorenz_table, population_share=0.5)
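This page does not say how the lookup handles a population share that falls between two rows of the table. As a rough illustration only, the sketch below assumes linear interpolation via numpy.interp over the cum_population and cum_values_prop columns. The helper name lookup_value_share and the toy table are hypothetical, and get_value_proportion itself may behave differently (for example, by picking the nearest row).

```python
import numpy as np
import pandas as pd


def lookup_value_share(df, population_share):
    """Hypothetical helper: interpolate the cumulative value share.

    Assumes df has cum_population sorted in ascending order.
    """
    if not 0 <= population_share <= 1:
        raise ValueError("population_share must be between 0 and 1")
    return float(
        np.interp(population_share, df["cum_population"], df["cum_values_prop"])
    )


# Toy table with made-up numbers, shaped like the documented columns.
toy_table = pd.DataFrame(
    {
        "cum_population": [0.0, 0.25, 0.5, 0.75, 1.0],
        "cum_values_prop": [0.0, 0.1, 0.3, 0.6, 1.0],
    }
)

# Value share held by the bottom half of the toy population.
share_at_half = lookup_value_share(toy_table, 0.5)
```

In this toy table the bottom 50% of the population accounts for 30% of the total value, so share_at_half is 0.3.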
- vivainsights.create_lorenz.compute_gini(x)
Compute the Gini coefficient for a numeric vector.
The Gini coefficient is a measure of statistical dispersion used to represent inequality in a distribution.
- Parameters:
x (list, numpy.ndarray, or pandas.Series) – Numeric values (e.g. hours, emails sent).
- Returns:
The Gini coefficient.
- Return type:
float
- Raises:
ValueError – If x is not a numeric vector.
Examples
>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> vi.compute_gini(pq_data["Emails_sent"])
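To make the quantity concrete, here is a minimal sketch of one common Gini estimator, computed from the sorted values as sum((2i - n - 1) * x_i) / (n * sum(x)) for ranks i = 1..n. The function name gini_sketch is hypothetical, and this is not necessarily the formula compute_gini uses (some estimators apply a small-sample correction, for instance).

```python
import numpy as np


def gini_sketch(values):
    """Hypothetical helper: Gini coefficient via the sorted-rank formula."""
    x = np.sort(np.asarray(values, dtype=float))  # ascending order
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        raise ValueError("values must be non-empty with a non-zero sum")
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * x) / (n * total))


equal_split = gini_sketch([1, 1, 1, 1])  # everyone holds the same amount
one_holder = gini_sketch([0, 0, 0, 1])   # one person holds everything
```

Under this estimator a perfectly equal distribution gives 0, and a distribution where one of n members holds everything gives (n - 1) / n, which is 0.75 for the four-element example.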
- vivainsights.create_lorenz.create_lorenz(data, metric, return_type='plot', figsize=None)
Calculate the Lorenz curve and Gini coefficient for a metric.
- Parameters:
data (pandas.DataFrame) – DataFrame containing the data to analyse.
metric (str) – Column name of the numeric values to analyse.
return_type (str, default "plot") – "plot" to display a Lorenz curve, "gini" to return the Gini coefficient, or "table" for a DataFrame of cumulative shares.
figsize (tuple of float or None, default None) – Figure size (width, height) in inches. Defaults to (8, 6).
- Returns:
The Lorenz curve figure, the Gini coefficient, or a table of cumulative population and value shares.
- Return type:
matplotlib.figure.Figure, float, or pandas.DataFrame
- Raises:
ValueError – If metric is not in the DataFrame or return_type is invalid.
Examples
Compute the Gini coefficient:
>>> import vivainsights as vi
>>> vi.create_lorenz(data=vi.load_pq_data(), metric="Emails_sent", return_type="gini")
Display the Lorenz curve plot:
>>> vi.create_lorenz(data=vi.load_pq_data(), metric="Emails_sent", return_type="plot")
Return a table of cumulative population and value shares:
>>> vi.create_lorenz(data=vi.load_pq_data(), metric="Emails_sent", return_type="table")
Customize the figure size:
>>> vi.create_lorenz(
...     data=vi.load_pq_data(),
...     metric="Collaboration_hours",
...     return_type="plot",
...     figsize=(10, 8),
... )
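For readers who want to see what a table of cumulative shares involves, the sketch below builds one by hand: sort the metric in ascending order, then take running shares of the population and of the total value. The column names cum_population and cum_values_prop follow the ones mentioned for get_value_proportion; the function name lorenz_table_sketch and the toy data are hypothetical, and the table returned by create_lorenz may contain other columns or rows.

```python
import numpy as np
import pandas as pd


def lorenz_table_sketch(data, metric):
    """Hypothetical helper: cumulative population and value shares."""
    if metric not in data.columns:
        raise ValueError(f"{metric!r} is not a column in the DataFrame")
    values = np.sort(data[metric].to_numpy(dtype=float))  # ascending order
    n = values.size
    return pd.DataFrame(
        {
            # Share of the population up to and including each row.
            "cum_population": np.arange(1, n + 1) / n,
            # Share of the total metric value held by that population share.
            "cum_values_prop": np.cumsum(values) / values.sum(),
        }
    )


# Made-up data for illustration only.
toy = pd.DataFrame({"Emails_sent": [10, 30, 20, 40]})
table = lorenz_table_sketch(toy, "Emails_sent")
```

Plotting cum_values_prop against cum_population gives the Lorenz curve; the further it bows below the diagonal, the more unequal the distribution.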