Skip to main content

Viva Insights Sample Code

Skills data join requirements

Development Environment Setup

For performing data joins and analyses, this tutorial covers example scripts in both the R and Python analysis environments. Here are the pre-requisites for them respectively:

Option 1: R Environment

To run the R code in this tutorial, you must have R and ideally a code editor installed:

Required R Packages:

The next step is to install the required R packages:

  • vivainsights: Microsoft’s R package for Viva Insights analysis
  • tidyverse: Collection of R packages for data science
  • here: Easy file path management

If these are not already installed, you can install them in R with:

install.packages(c("vivainsights", "tidyverse", "here"))

Option 2: Python Environment

To run the Python code in this tutorial, you must have Python and ideally a code editor installed:

Required Python Packages:

Ensure that you have the following python packages installed:

  • vivainsights: Microsoft’s Python package for Viva Insights analysis
  • pandas: Data manipulation and analysis library
  • numpy: Fundamental package for scientific computing
  • plotly: Interactive visualization library
  • scipy: Scientific computing tools

If not, you can install them in the Command Prompt with:

pip install vivainsights pandas numpy plotly scipy

Once the developer pre-requisites are satisfied, see how to load and join data.

Required Viva Insights schema and join keys

Before running the joins, confirm each input table is present and contains its join key(s). Missing or non-unique keys are the most common cause of failed or exploded joins:

Table Role Required key column(s) Notes
MetricOutput (Person Query) Main / left table PersonId, MetricDate, MetricPrimaryKey, PeopleHistoricalId Add any outcome metrics (e.g. Collaboration_hours, Total_Copilot_actions_taken) here. One row per person-week.
HR Org attributes PeopleHistoricalId One row per person-history; should be unique on the key.
PersonSkillsMappingMetadata Bridge MetricPrimaryKey, SkillHistoricalId One-to-many: expands rows to person-skill grain.
PersonSkills Skill instances SkillHistoricalId, SkillId Links a person’s skill record to the skills catalog.
SkillsLibrary Skills catalog SkillId One row per skill; should be unique on SkillId.

Minimal pre-join checklist:

  • ✅ Every table above loaded without error and is non-empty.
  • ✅ Each join key column exists and has no unexpected NAs.
  • ✅ The “one” side of each join (HR, SkillsLibrary) is unique on its key (df %>% count(key) %>% filter(n > 1) returns no rows).
  • ✅ You have recorded the row count of MetricOutput so you can verify expected row growth after each join.
Last updated: Jun 16, 2026 Edit this page on GitHub