Predicting Hospital Length of Stay

Implemented with Microsoft Machine Learning Services

PowerShell Instructions


If you have deployed a VM using the Deploy to Azure button on GitHub, all the steps below have already been performed and your database on that machine has all the resulting tables and stored procedures. Skip to the Typical Workflow for a description of how these files were first created in R by a Data Scientist and then deployed to SQL stored procedures.

If you are configuring your own server, or if you want to reset your VM to its initial state, continue with the steps below to run the PowerShell script.

Setup


First, make sure you have set up your SQL Server by following these instructions. Then proceed with the steps below to run the solution template using the automated PowerShell file.

Create Data and Train Model


Running this PowerShell script will create the data tables and stored procedures for the the operationalization of this solution, both in R (in the Hospital_R database) and Python (in the Hospital_Py database). It will also execute these procedures to create full database with results of the steps – dataset creation, modeling, and scoring as described here.

  1. Log onto the computer that contains the SQL Server you wish to use.

  2. Install Git if it is not already present. During the install, check the box to add LFS support.

  3. If you wish to install the sample website to demonstrate using the model, install node.js if it is not already present.

  4. Download HospitalSetup.ps1 to your computer. The best way of downloading the file is to clone the repository using git clone https://github.com/Microsoft/r-server-hospital-length-of-stay. You will find HospitalSetup.ps1 in the Resources/ActionScripts folder.

  5. Open a command or PowerShell window as Administrator.

  6. CD to the directory where you downloaded the above .ps1 file and execute the command:

    .\HospitalSetup.ps1

  7. Answer the prompts. Make sure to accept installation of NuGet if prompted.

  8. This will make the following modification to your SQL Server:

    • Installs the SQL Server PowerShell module. If this is already installed, it will update it if necessary.
    • Changes Authentication Method to Mixed Mode, which is needed in this version of the solution.
    • Creates the SLQRUserGroup for running R and Python code.
    • Reconfigures SQL Server to allow running of external scripts.
    • Creates a user with provided username and password
    • Elevates user’s credentials to SA.
    • Clones the solution code and data into the c:\Solutions\Hospital directory
    • Creates the solution database Hospital_R and configures an ODBC connection to the database.
    • Executes the stored procedure Initial_Run_Once_R to run the entire workflow with R for this solution.
    • If SQL Server 2017: creates the solution database Hospital_Py and configures an ODBC connection to the database.
    • If SQL Server 2017: Executes the stored procedure Initial_Run_Once_Py to run the entire workflow with Python for this solution.
    • Installs the sample website if node.js is installed.

Review Data


Once the PowerShell script has completed successfully, log into the SQL Server Management Studio to view all the datasets that have been created in the Hospital_R and Hospital_Py databases. Hit Refresh if necessary.

Click here to view more information about each of these tables.

Right click Boosted_Prediction and select View Top 1000 Rows to preview the scored regression results.

Click here to view the details all tables created in this solution.

Visualizing Results


You’ve now created and processed data, created models, and predicted LOS as described here. This PowerShell script also created the stored procedures that can be used to score new data in the future.

Let’s look at our current results. Proceed to Visualizing Results with PowerBI.

Sample Website for Native Scoring


This sample website shows how you might use the solution to show an estimate of the patient’s length of stay during the admission process.

Other Steps


You’ve just completed the fully automated solution that simulates the data, trains and scores the models by executing PowerShell scripts.

See the Typical Workflow for a description of how these files were first created in R by a Data Scientist and then incorporated into the SQL stored procedures that you just deployed.