Introduction

This report allows you to review the quality of the Viva Insights data available and highlights specific issues that may require your attention before starting analysis. This report is structured in three sections:

  1. Viva Insights Settings
  2. Organizational Data Quality
  3. M365 Data Quality

The Microsoft Team behind Viva Insights has developed a series of data checks for each section. For the areas that have issues, we also provide you with suggestions to further clean up the data before performing additional analysis.

This report will automatically conduct certain data quality tests. Results will be indicated as a [Pass] or [Warning] throughout the report. [Warning] messages will direct you to items that need your attention and potential action.

For additional information about Viva Insights, including metric definitions, please visit our official documentation.


Data Available

Query Check

There are 100 employees in this dataset.

Date ranges from 2022-05-01 to 2022-07-03.

There are 6 (estimated) HR attributes in the data: PersonId, LevelDesignation, SupervisorIndicator, Organization, FunctionType, WeekendDays

All 100 employees in the dataset are active.

1. Viva Insights Settings

1.1 Outlook Settings

Viva Insights uses the working days and hours settings from each measured Microsoft 365 Exchange mailbox to calculate collaboration metrics. This data allows the system to distinguish between collaboration activity (email, meetings, and Teams calls & IMs) that takes place during and outside of working hours.

The most frequent working hours set in Outlook in this dataset are the following:

[Note] Outlook hours analysis is unavailable as the data does not have the following variables: WorkingStartTimeSetInOutlook, WorkingEndTimeSetInOutlook

Abnormal Outlook settings (i.e. a significant number of users defining very short or very long working days) may skew analysis results, making after-hours collaboration look particularly high or low.

[Pass] The ratio of after-hours collaboration to total collaboration hours is outside the expected threshold for 0 employees (0% of the total).

  • 0 employees (0%) have unusually high after-hours collaboration (relative to weekly collaboration hours)
  • 0 employees (0%) have unusually low after-hours collaboration

If you believe abnormal Outlook settings are distorting your results, consider standardizing working hours across your analysis population. You can override Outlook settings by setting a global parameter in the Dependencies section of the Person query.
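
As a rough illustration of the ratio check described above, the following standalone Python sketch sums after-hours and total collaboration hours per person and flags ratios outside a band. It is not the code used by this report: the function name, the row layout, and the `low` and `high` cut-offs are hypothetical, since the report does not state the thresholds it applies.

```python
def flag_after_hours_ratio(rows, low=0.01, high=0.5):
    """Flag persons whose after-hours share of collaboration is outside [low, high].

    rows: iterable of (person_id, after_hours, collaboration_hours) person-weeks.
    low/high: hypothetical cut-offs chosen for illustration only.
    """
    totals = {}
    for person_id, after_hours, collab_hours in rows:
        acc = totals.setdefault(person_id, [0.0, 0.0])
        acc[0] += after_hours
        acc[1] += collab_hours

    flags = {}
    for person_id, (after_hours, collab_hours) in totals.items():
        if collab_hours == 0:
            continue  # ratio is undefined without any collaboration
        ratio = after_hours / collab_hours
        if ratio > high:
            flags[person_id] = "high"
        elif ratio < low:
            flags[person_id] = "low"
    return flags


rows = [
    ("A", 2.0, 20.0),
    ("A", 3.0, 20.0),
    ("B", 15.0, 20.0),
    ("C", 0.0, 25.0),
]
ratio_flags = flag_after_hours_ratio(rows)
```

In this toy data, person B is flagged as high and person C as low; person A falls inside the band.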

1.2 Meeting Exclusion Rules

Viva Insights uses email and calendar activities that are stored in a person’s Office 365 account to reveal internal and external collaboration trends. However, a person’s calendar and email can contain a diverse set of activities (such as personal meetings or appointments, social activities, all-day training meetings, and so forth) that are not relevant to work-related collaboration, and, if included in the metrics, would skew query results.

This section analyzes the subject lines from the supplied meeting query to identify whether common exclusion terms are present in your data (e.g. happy hour, yoga class, team dinner). For more information, please see the documentation on meeting exclusion rules.

[Warning] 0 meetings (0% of 612) require your attention as they contain common exclusion terms.

If you believe that your meeting data requires further cleanup, please consider defining a new meeting exclusion rule under Settings in Viva Insights and re-running your queries. If you want to further investigate this issue, you can flag these meetings in your dataset using subject_validate(data, return = "data"). You can also generate a more detailed report using subject_validate_report().
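
To make the idea of subject-line screening concrete, here is a minimal standalone Python sketch that flags meeting subjects containing any of the example terms mentioned above. It is an illustration only and not the logic of subject_validate(): the term list contains just the three examples from this section, and the function name `flag_subjects` is hypothetical.

```python
import re

# Only the example terms named in this section; a real list would be longer.
EXCLUSION_TERMS = ["happy hour", "yoga class", "team dinner"]


def flag_subjects(subjects, terms=EXCLUSION_TERMS):
    """Return one boolean per subject: True if it contains an exclusion term."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [bool(pattern.search(subject)) for subject in subjects]


subjects = [
    "Weekly sync",
    "Friday Happy Hour",
    "Yoga class (lunch)",
    "Budget review",
]
subject_flags = flag_subjects(subjects)
flagged_count = sum(subject_flags)
flagged_share = flagged_count / len(subjects)
```

The count and share computed at the end correspond to the kind of figures reported in the [Warning] line for this check.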


2. Organizational Data Quality

Organizational data is descriptive information about the employees in your organization, such as the employee’s organization, job function, level, etc. This data has been uploaded by your organization’s Viva Insights Administrator. The quality of this information is important as it enables Viva Insights to attribute Office 365 data to specific groups, and slice the collaboration data in different ways to uncover relevant trends for your organization.

2.1 Attributes Available

The table below shows the organizational attributes available in this dataset. Use this table to understand the data’s quality and completeness.

  • Be mindful of attributes that have many unique values, as this may limit data aggregation and filtering (for example, if a job function or code is too narrowly defined, it might not give you a useful view of the overall group).
  • Additionally, review missing values, as some attributes may only be partially available for the population in this sample.

2.2 Groups Under Privacy Threshold

To minimize privacy risk, queries from Viva Insights are anonymized. We also recommend that during analysis, collaboration patterns from teams or departments are reported in an aggregated way, respecting a minimum-group-size privacy threshold (your Viva Insights Administrator has already defined a minimum-group-size rule that affects Explore charts and Plans within Viva Insights).

The default minimum-group setting in this report is five, but this setting can be changed according to the privacy requirements of your organization. Re-run this report using the mingroup parameter to use a custom minimum group size setting.

[Pass] There is only 1 group under the minimum group size privacy threshold of 5.
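
The privacy threshold check can be sketched as follows. This standalone Python example counts distinct persons per group and returns the groups that fall below the minimum size. The row layout and the function name are assumptions made for illustration; only the default of five comes from this report.

```python
from collections import defaultdict


def groups_under_threshold(rows, mingroup=5):
    """Return {group: distinct person count} for groups smaller than mingroup.

    rows: iterable of (person_id, group) pairs; a person may appear in many
    person-week rows and is counted once per group.
    """
    members = defaultdict(set)
    for person_id, group in rows:
        members[group].add(person_id)
    return {g: len(p) for g, p in members.items() if len(p) < mingroup}


rows = [(f"P{i}", "Finance") for i in range(5)]
rows += [("P0", "Finance")]  # repeated person-week, not a new person
rows += [("P5", "Legal"), ("P6", "Legal"), ("P5", "Legal")]

small_groups = groups_under_threshold(rows)
```

Counting distinct persons rather than rows matters here, because each employee contributes one row per week to the query.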

The following groups are available in this dataset:

2.3 Distribution of Employees in Key Attributes

This section can help you understand the population scope for your analysis and validate the size of your selected grouping based on your business knowledge. This report uses Organization as the default grouping, but you can specify another attribute of your choosing as the hrvar input of the function validation_report(). Note that a minimum threshold has not been applied to this section, so the full list is provided for your review.

2.4 Updates to Organizational Data

Viva Insights administrators are advised to keep the organizational data that is uploaded into Workplace Analytics up to date. These updates help guarantee that relevant changes in the organizational structure are reflected in the system, that the collaboration data of new joiners is captured, and that all Office 365 data flows are attributed to the right teams and departments (even when some employees change roles or are promoted internally).

The following chart shows the observed mobility of employees between teams in your organization. Lack of changes could indicate that the organizational data has not been frequently updated during the period under consideration.
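
One simple way to quantify mobility, shown here as a standalone Python sketch, is to count how often each person's organization value changes between consecutive weeks. This is an illustrative assumption about how such a check could work, not the method behind the chart; the row layout and the function name `count_moves` are hypothetical.

```python
def count_moves(rows):
    """Count organization changes per person across consecutive weeks.

    rows: iterable of (person_id, iso_date, organization); ISO dates
    (YYYY-MM-DD) sort correctly as plain strings.
    """
    history = {}
    for person_id, _date, org in sorted(rows, key=lambda r: (r[0], r[1])):
        history.setdefault(person_id, []).append(org)
    return {
        person_id: sum(1 for prev, cur in zip(orgs, orgs[1:]) if prev != cur)
        for person_id, orgs in history.items()
    }


rows = [
    ("A", "2022-05-01", "Sales"),
    ("A", "2022-05-15", "Finance"),
    ("A", "2022-05-08", "Sales"),
    ("B", "2022-05-01", "HR"),
    ("B", "2022-05-08", "HR"),
]
moves = count_moves(rows)
total_moves = sum(moves.values())
```

A total of zero moves over a long period would be consistent with organizational data that has not been refreshed.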

2.5 Quality of Tenure Data

When the employee’s HireDate is available as an organizational data attribute, it can be used to calculate the tenure of employees. This section checks the quality of the tenure field, which is calculated as the employee’s last weekly collaboration date in your query minus the HireDate. The plot below sheds light on the tenure distribution of your analysis population.

[Note] Tenure analysis is unavailable as the data has no HireDate variable.
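
For reference, the tenure calculation described above (last weekly collaboration date minus HireDate) can be sketched in standalone Python as below. The conversion to years using 365.25 days and the check for negative tenure are assumptions added for illustration; the dates are made-up examples, not values from this dataset.

```python
from datetime import date


def tenure_years(last_collab_date, hire_date):
    """Tenure in years: last collaboration date minus hire date."""
    return (last_collab_date - hire_date).days / 365.25  # assumed year length


# Hypothetical people: (last weekly collaboration date, HireDate)
people = {
    "A": (date(2022, 7, 3), date(2020, 7, 3)),
    "B": (date(2022, 7, 3), date(2023, 1, 1)),  # hired after last activity
}

tenure = {p: tenure_years(last, hired) for p, (last, hired) in people.items()}

# A negative tenure suggests a data quality problem in HireDate.
negative_tenure = sorted(p for p, years in tenure.items() if years < 0)
```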

3. M365 Data Quality

This section evaluates the quality of the collaboration data available in your tenant, which is calculated from email, meeting, call, and IM flows within your company (all gathered from M365 Exchange and Teams). In general, collaboration data provides a very accurate description of the digital habits of employees and their daily experience as they interact with peers and other teams. However, data may not be available for all employees, or may only partially capture their day-to-day experience (for example, in teams that use other communication platforms or that interact face-to-face without planning meetings in advance).

3.1 Population Over Time

This section provides a view of the licensed population available in the query over time. Note that the values seen on this plot could differ from the actual licensed population due to filters on Activeness, holiday weeks, and non-knowledge workers.

3.2 Non-knowledge Workers

Non-knowledge workers refer to persons with unusually low average collaboration hours. These may represent individuals who are not required to collaborate via Outlook and Teams as part of their role or shift or may be part-time staff. Viva Insights data may not be representative of these individuals’ workday experience.

For this reason, we suggest excluding non-knowledge workers from your analysis. You can easily remove them from your dataset by using the function identify_nkw(return = "data_clean").

[Pass] There are no non-knowledge workers identified (average collaboration hours below 5 hours).
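
The rule stated in the check above (average collaboration hours below 5) can be illustrated with a short standalone Python sketch. It is not the implementation of identify_nkw(); the row layout and the function name are hypothetical.

```python
from collections import defaultdict


def low_collaboration_persons(rows, collab_threshold=5.0):
    """Return persons whose average weekly collaboration hours are below the threshold.

    rows: iterable of (person_id, collaboration_hours) person-weeks.
    """
    hours = defaultdict(list)
    for person_id, collab_hours in rows:
        hours[person_id].append(collab_hours)
    return sorted(
        person_id
        for person_id, values in hours.items()
        if sum(values) / len(values) < collab_threshold
    )


rows = [
    ("A", 2.0), ("A", 4.0),    # average 3.0, below the threshold
    ("B", 20.0), ("B", 18.0),  # average 19.0
    ("C", 5.0), ("C", 5.0),    # average exactly 5.0, not below
]
nkw = low_collaboration_persons(rows)
```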

3.3 Company Holiday Weeks

Holiday weeks are weeks in the data where the collaboration hours of the sample are unusually low. These are typically removed from analysis as they represent public holidays where the patterns of collaboration are not representative of the norm. Note that this applies to whole weeks, i.e. the data for the week is removed for all employees in the sample.

The weeks where collaboration was 1 standard deviation below the mean (18.7) are: (none listed)

You can easily remove holiday weeks from your dataset by using the function identify_holidayweeks(return = "data_cleaned").
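
The holiday week rule can be sketched as follows in standalone Python: average collaboration hours per week, then flag weeks whose average is more than a given number of standard deviations below the mean of the weekly averages. This is a sketch under stated assumptions, not the code of identify_holidayweeks(); in particular, the use of the sample standard deviation of weekly averages is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def low_weeks(rows, sd=1.0):
    """Return weeks whose mean collaboration hours fall below mean - sd * stdev.

    rows: iterable of (iso_week_start, collaboration_hours) person-weeks.
    Requires at least two distinct weeks.
    """
    by_week = defaultdict(list)
    for week, hours in rows:
        by_week[week].append(hours)
    weekly_mean = {week: mean(values) for week, values in by_week.items()}
    averages = list(weekly_mean.values())
    cutoff = mean(averages) - sd * stdev(averages)
    return sorted(week for week, avg in weekly_mean.items() if avg < cutoff)


rows = [
    ("2022-05-01", 20.0), ("2022-05-01", 20.0),
    ("2022-05-08", 21.0), ("2022-05-08", 21.0),
    ("2022-05-15", 19.0), ("2022-05-15", 19.0),
    ("2022-05-22", 5.0), ("2022-05-22", 5.0),  # unusually quiet week
]
holiday_candidates = low_weeks(rows)
```

Because the rule operates on weekly averages, a flagged week is removed for every employee, as described above.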

3.4 Inactive Weeks

Inactive weeks are person-weeks in the data where an individual's collaboration hours are unusually low. These are typically removed as they represent individual holidays where the patterns of collaboration are not representative of the norm. Note that this applies to person-weeks, i.e. the data is only removed for an individual for a given week if it is low for that employee.

There are 3 rows of data with weekly collaboration hours more than 2 standard deviations below the mean (18.7).

You can easily remove inactive weeks from your dataset by using the function identify_inactiveweeks(return = "data_cleaned").
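
In contrast to holiday weeks, the inactive week rule works at person-week level. The standalone Python sketch below flags a person-week when that person's hours are more than a given number of standard deviations below their own mean. Treating the mean and standard deviation as per-person quantities is an assumption based on the description above, and this is not the code of identify_inactiveweeks().

```python
from collections import defaultdict
from statistics import mean, stdev


def low_person_weeks(rows, sd=2.0):
    """Return (person_id, week) pairs with hours below that person's mean - sd * stdev.

    rows: iterable of (person_id, week_label, collaboration_hours).
    """
    by_person = defaultdict(list)
    for person_id, week, hours in rows:
        by_person[person_id].append((week, hours))

    flagged = []
    for person_id, weeks in by_person.items():
        values = [hours for _, hours in weeks]
        if len(values) < 2:
            continue  # standard deviation needs at least two weeks
        spread = stdev(values)
        if spread == 0:
            continue  # identical weeks, nothing stands out
        centre = mean(values)
        for week, hours in weeks:
            if (hours - centre) / spread < -sd:
                flagged.append((person_id, week))
    return sorted(flagged)


weeks = [f"W{i:02d}" for i in range(1, 11)]
rows = [("A", w, 2.0 if w == "W04" else 20.0) for w in weeks]
rows += [("B", w, h) for w, h in zip(weeks, [18, 19, 20, 21, 22] * 2)]

inactive = low_person_weeks(rows)
```

Only person A's quiet week is flagged; person B's normal week-to-week variation stays within two standard deviations.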

3.5 Extreme Values

This section runs checks against the core collaboration metrics (Email, Meeting, Teams Call, and Teams Instant Message hours) to flag any extreme values. If a significant number of extremely high or low values is identified, the analyst should investigate the cause before proceeding further with the analysis.
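
Each check in this section reports two counts: persons whose average for a metric exceeds a threshold, and individual rows that exceed it. A minimal standalone Python sketch of that pair of counts is shown below; the function name and row layout are hypothetical and the data is made up.

```python
from collections import defaultdict


def extreme_value_counts(rows, threshold):
    """Return (persons with average above threshold, rows above threshold).

    rows: iterable of (person_id, metric_hours) person-weeks.
    """
    rows = list(rows)
    by_person = defaultdict(list)
    for person_id, value in rows:
        by_person[person_id].append(value)
    n_persons = sum(
        1 for values in by_person.values()
        if sum(values) / len(values) > threshold
    )
    n_rows = sum(1 for _, value in rows if value > threshold)
    return n_persons, n_rows


rows = [
    ("A", 90.0), ("A", 85.0),   # average 87.5
    ("B", 100.0), ("B", 10.0),  # average 55.0, but one extreme row
    ("C", 30.0),
]
persons_over, rows_over = extreme_value_counts(rows, threshold=80)
```

The two counts can differ, as for person B here: a single extreme week raises the row count without pushing the person's average over the threshold.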

3.5.1 Extreme values: Email

[Pass] There are no persons where their average Email hours exceeds 80.

[Pass] There are no rows where their value of Email hours exceeds 80.

3.5.2 Extreme values: Meeting

[Warning] There are 3 persons where their average Meeting hours exceeds 80.

[Warning] There are 40 rows where their value of Meeting hours exceeds 80.

3.5.3 Extreme values: Calls

[Pass] There are no persons where their average Call hours exceeds 40.

[Pass] There are no rows where their value of Call hours exceeds 40.

3.5.4 Extreme values: IM

[Pass] There are no persons where their average Chat hours exceeds 40.

[Pass] There are no rows where their value of Chat hours exceeds 40.

3.5.5 Extreme values: Conflicting Meetings

[Warning] There are 2 persons where their average Conflicting meeting hours exceeds 70.

[Warning] There are 17 rows where their value of Conflicting meeting hours exceeds 70.

3.5.6 Extreme values: Zero collaboration

[Pass] There are no persons where their average Collaboration hours are equal to 0.

[Pass] There are no rows where their value of Collaboration hours are equal to 0.