TextClassification

Implemented with Microsoft Machine Learning Services

Template Contents


The following is the directory structure for this template:

  • Data: Contains data for scoring. Other data is downloaded during the solution workflow.
  • R: Contains the R code to prepare the training, testing, and evaluation sets, train the multi-class classifier, and evaluate the model.
  • Python: Contains the Python code to prepare the training, testing, and evaluation sets, train the multi-class classifier, and evaluate the model.
  • SQLR: Stored procedures in SQL that implement the model training workflow with R code.
  • SQLPy: Stored procedures in SQL that implement the model training workflow with Python code.
  • Resources: Contains other resources for the solution package.
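
As a quick sanity check on a local copy of the template, the layout listed above can be verified with a few lines of Python. This is only a minimal sketch: the function name `missing_template_dirs` and the idea of passing in a root path are assumptions made for illustration and are not part of the template itself.

```python
from pathlib import Path

# Top-level directories named in the listing above.
EXPECTED_DIRS = ["Data", "R", "Python", "SQLR", "SQLPy", "Resources"]


def missing_template_dirs(root):
    """Return the expected directory names that are absent under root.

    Hypothetical helper, not shipped with the template.
    """
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

An empty list means every directory from the listing is present under the given root.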

Data


Data for training and testing is downloaded and added to this directory during the workflow, so more files will be present after the solution has been run once.

File           Description
News_To_Score  Text file containing new data for scoring.

Model Development in R


File                       Description
TextClassificationR.ipynb  Create features on the fly for the training and testing set, train model, make predictions, and evaluate the model in a Jupyter notebook.
run_modeling_main.R        Create features on the fly for the training and testing set, train model, make predictions, and evaluate the model.

Operationalize in SQL R


Stored procedures in SQL implement the model training workflow with R code.

File                             Description
Load_Data.ps1                    Loads all data for the solution if you'd like to create a second instance of the solution on the same server
execute_yourself.sql             Runs through all the steps of the solution
step0_create_tables.sql          Create data tables, invoked in Load_Data.ps1
step1_create_features_train.sql  Create features on the fly and train model
step2_score.sql                  Scores data with model created in step1
step3_evaluate.sql               Evaluates model created in step1
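
The numeric prefixes on the step scripts suggest the order in which they are meant to run. The sketch below makes that order explicit by sorting the file names on the step number. It assumes the names above reflect execution order, which the listing implies but does not state outright; `in_step_order` is a hypothetical helper and does not execute any SQL.

```python
import re

# Step scripts from the table above, deliberately shuffled.
SQL_SCRIPTS = [
    "step3_evaluate.sql",
    "step1_create_features_train.sql",
    "step0_create_tables.sql",
    "step2_score.sql",
]


def in_step_order(names):
    """Sort script names by the number in their 'stepN_' prefix."""

    def step_number(name):
        match = re.match(r"step(\d+)_", name)
        if match is None:
            raise ValueError(f"not a step script: {name}")
        return int(match.group(1))

    return sorted(names, key=step_number)


for script in in_step_order(SQL_SCRIPTS):
    print(script)
```

Note that, per the table, step0_create_tables.sql is invoked from Load_Data.ps1, and execute_yourself.sql is described as running through all the steps.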

Model Development in Python


File                       Description
TextClassificationR.ipynb  Create features on the fly for the training and testing set, train model, make predictions, and evaluate the model in a Jupyter notebook.
run_modeling_main.py       Create features on the fly for the training and testing set, train model, make predictions, and evaluate the model.

Operationalize in SQL Python


Stored procedures in SQL implement the model training workflow with Python code.

File                             Description
Load_Data.ps1                    Loads all data for the solution if you'd like to create a second instance of the solution on the same server
execute_yourself.sql             Runs through all the steps of the solution
step0_create_tables.sql          Create data tables, invoked in Load_Data.ps1
step1_create_features_train.sql  Create features on the fly and train model
step2_score.sql                  Scores data with model created in step1
step3_evaluate.sql               Evaluates model created in step1

Resources for the Solution Package


File                                                   Description
.\Resources\ActionScripts\ConfigureSQL.ps1             Configures SQL, called from SetupVM.ps1
.\Resources\ActionScripts\CreateDatabase.sql           Creates the database for this solution, called from ConfigureSQL.ps1
.\Resources\ActionScripts\CreateSQLObjectsPy.sql       Creates the tables and stored procedures for this solution, called from ConfigureSQL.ps1
.\Resources\ActionScripts\CreateSQLObjectsR.sql        Creates the tables and stored procedures for this solution, called from ConfigureSQL.ps1
.\Resources\ActionScripts\TextClassificationSetup.ps1  Configures SQL, creates and populates database
.\Resources\ActionScripts\SolutionHelp.url             URL to the help page
